Welcome to Python for Data Science.

The objective of this module is to provide the fundamental understanding of the Python programming language needed to follow an introductory course in Data Science.

You will start with the basics of Python programming, including Python data structures, functions and classes.

We follow this up with an introduction to Numerical Python (NumPy), and the course concludes with a basic introduction to linear regression from scratch.

Along the way, we will introduce foundational ideas of statistics, linear algebra and calculus.

At the end of this module, you will have the tools and the concepts needed to successfully undertake a rigorous course in machine learning.

This page introduces you to the team, the basic instructions, the schedule and various elements of our class.

The Team

Hargun Singh Oberoi



  • Hargun Oberoi is a Product Manager at Univ.Ai.
  • He has a Master's degree in Mathematics from BITS Pilani University.
  • He is currently working as a research fellow at the StellarDNN lab.

Dr. Pavlos Protopapas


  • Scientific Director of the Institute for Applied Computational Science (IACS).
  • Teaches Introduction to Data Science (CS109a), Advanced Topics in Data Science (CS109b) and Advanced Practical Data Science (AC215).
  • He is a leader in astrostatistics and he is excited about the new telescopes coming online in the next few years.

You can read more about him here.

Teaching Assistants

Click on a TA's avatar to learn more about them.


The Coursework


We have carefully designed the coursework to give you, the student, a well-rounded learning experience.

We will hold two weekend sessions per week for a total of five weeks.

What to expect

Pre-Session
Before the session begins, students are expected to complete a pre-class reading assignment and attempt a quiz.

During Session
During the session, we will have live instruction interspersed with collaborative coding in small groups, supported by our teaching assistants. This will help you develop intuition for the core concepts and provide guidance on technical details.

Post-Session
After the session, students are expected to complete a short post-class quiz based on the principal concepts covered in class.

Course syllabus

The Class

Course schedule

Note:

Session 1 will start at 7:00 PM IST (8:30 AM EST).

Sessions:

Sessions 1 to 3: 7:30 PM - 9:30 PM IST [9:00 AM - 11:00 AM EST]

Session 4 onwards: 6:30 PM - 8:30 PM IST [9:00 AM - 11:00 AM EST]

Office hours:

  • Thursdays: 7:30 PM - 8:30 PM IST

Sample Class

We believe in active learning, and our course is designed with the expectation that students will participate actively.

Please find a demo of our course style and pedagogy.

Course Topics

Note: Prior knowledge of Python programming is not necessary for this module.

Session 1 (Basic Python):

  • Introduction to Python
  • Data types, iterators, Python operations
  • Order of operations, logical operators

Session 2 (Advanced Python):

  • Python Data Structures - Lists, Dictionaries, Tuples
  • List/dictionary comprehensions
  • Enumeration

Session 3 (Functions):

  • Python Functions - Arguments, keyword arguments, etc.
  • Anonymous functions (lambda functions)
  • Function decorators

Session 4 (Classes):

  • Classes: Constructors vs. Instantiations
  • Methods vs. Attributes
  • Dunder methods

Session 5 (Strings):

  • Working with strings
  • Reading & writing files
  • Python standard library

Session 6 (NumPy):

  • Debugging skills
  • Exception handling
  • Introduction to Numerical Python (NumPy)

Session 7 (Pandas):

  • Python data analysis
  • Introduction to Pandas
  • Database management using pandas

Session 8 (Probability):

  • Introduction to Random Variables
  • Basic probability simulations
  • Descriptive statistics

Session 9 (Linear Regression):

  • Derivatives (including partial)
  • Linear regression
  • Multi-linear regression

Diversity & Inclusion

We actively seek and welcome people of diverse identities, from across the spectrum of disciplines and methods, since Artificial Intelligence (AI) increasingly mediates our social, cultural, economic, and political interactions [1].

We believe in creating and maintaining an inclusive learning environment where all members feel safe, respected, and capable of producing their best work.

We commit to an experience for all participants that is free from harassment, bullying, and discrimination, which includes but is not limited to:

  • Offensive comments related to age, race, religion, creed, color, gender (including transgender/gender identity/gender expression), sexual orientation, medical condition, physical or intellectual disability, pregnancy, national origin or ancestry.
  • Intimidation, personal attacks, harassment, or unnecessary disruption of talks during any of the learning activities.

Reference:

[1] K. Stathoulopoulos and J. C. Mateos-Garcia, “Gender Diversity in AI Research,” SSRN Electronic Journal, 2019 [Online]. Available: http://dx.doi.org/10.2139/ssrn.3428240.

Education software we use

  • Our lectures and labs are carried out via Zoom (install instructions).
  • Quizzes & exercises will be conducted on the digital learning platform Ed.

All exercises and homework in this course will be done in Jupyter notebooks. This link will help you set up JupyterLab and get acquainted with Jupyter notebooks.

Our module policies around collaboration and grading are listed here. Our expectations of you are also laid out in that document.


Parting Note

As you will learn in the course, programming for data science is not just about writing efficient code.

It requires proficiency in critical thinking, ideation and experimentation.

With that in mind, we advise you to give every session your full, active attention.

We wish you well for the start of your data science journey.