Data science is a discipline that uses scientific methods, processes, and algorithms to extract meaningful information, knowledge, and insights from structured and unstructured data.
This course provides an introduction to programming for data science using the Python programming language. It introduces the basics of the data science process: collecting data, pre-processing it (cleaning and correcting it), performing exploratory data analysis, visualizing data, and sharing the results of the analysis.
The course will rely on Jupyter Notebooks for interactive Python programming, as they are widely used in data science.
Before attending this course, students will need to know:
- the fundamentals of linear algebra: what a matrix is and how matrix addition and multiplication are performed;
- the following fundamental concepts of statistics: mean, median, variance, standard deviation, and interquartile range;
- the fundamentals of algebra: real and complex numbers, and exponential, logarithmic, and trigonometric functions.
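As an informal self-check of the linear algebra and statistics prerequisites listed above, the short sketch below computes each quantity on small made-up inputs. It is only an illustration: the matrices, the sample values, and the helper names `mat_add` and `mat_mul` are hypothetical, and it uses nothing beyond Python's standard library, which is not necessarily how the course itself will compute these quantities.

```python
import statistics

# Hypothetical 2x2 matrices, stored as lists of rows.
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]


def mat_add(X, Y):
    """Add two matrices of the same shape, element by element."""
    return [[x + y for x, y in zip(row_x, row_y)] for row_x, row_y in zip(X, Y)]


def mat_mul(X, Y):
    """Multiply X (m x n) by Y (n x p): each entry is a row-by-column dot product."""
    columns_of_y = list(zip(*Y))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns_of_y] for row in X]


matrix_sum = mat_add(A, B)       # [[6, 8], [10, 12]]
matrix_product = mat_mul(A, B)   # [[19, 22], [43, 50]]

# Hypothetical sample, chosen so the results are easy to verify by hand.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(values)            # 5
median = statistics.median(values)        # 4.5
variance = statistics.pvariance(values)   # population variance: 4
std_dev = statistics.pstdev(values)       # population standard deviation: 2.0

# Quartiles depend on the interpolation convention; this uses the module's default.
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1  # interquartile range

print(matrix_sum, matrix_product)
print(mean, median, variance, std_dev, iqr)
```

If you can predict the matrix results and the mean, median, variance, and standard deviation by hand before running the code, you likely have the background these prerequisites assume. The interquartile range may differ slightly from a hand calculation, because several quartile conventions are in common use.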
If you choose to register for accreditation and assessment, you will need access to the Internet and to a computer capable of running the open-source software used in the course in order to complete the assignment. Only a limited amount of class time will be allocated to working on the assignment, so you should ensure that you also have access to a computer outside of class.
If you are unable to attend this course in Oxford, an online version is also available.