Description
Welcome to the Python for Data Science – NumPy, Pandas & Scikit-Learn course, where you can test your Python programming skills in data science, specifically in NumPy, Pandas and Scikit-Learn.
Some topics you will find in the NumPy exercises:
- working with numpy arrays
- generating numpy arrays
- generating numpy arrays with random values
- iterating through arrays
- dealing with missing values
- working with matrices
- reading/writing files
- joining arrays
- reshaping arrays
- computing basic array statistics
- sorting arrays
- filtering arrays
- image as an array
- linear algebra
- matrix multiplication
- determinant of the matrix
- eigenvalues and eigenvectors
- inverse matrix
- shuffling arrays
- working with polynomials
- working with dates
- working with strings in arrays
- solving systems of equations
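To give a flavor of the linear algebra topics above, here is a minimal sketch and not an actual exercise from the course. The 2x2 system and the variable names are made up for illustration; only the `np.linalg` functions are real NumPy calls.

```python
import numpy as np

# Hypothetical system (illustrative values, not from the course):
#   2x +  y = 5
#    x + 3y = 10
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

# Solve A @ [x, y] = b without forming the inverse explicitly
solution = np.linalg.solve(A, b)

# Determinant and inverse of the coefficient matrix
det = np.linalg.det(A)
inverse = np.linalg.inv(A)

# Eigenvalues and eigenvectors (eigenvectors are stored as columns)
eigenvalues, eigenvectors = np.linalg.eig(A)

print("solution:", solution)
print("determinant:", det)
print("eigenvalues:", eigenvalues)
```

Using `np.linalg.solve` instead of multiplying by the inverse is generally the more numerically stable choice for solving a system.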
Some topics you will find in the Pandas exercises:
- working with Series
- working with DatetimeIndex
- working with DataFrames
- reading/writing files
- working with different data types in DataFrames
- working with indexes
- working with missing values
- filtering data
- sorting data
- grouping data
- mapping columns
- computing correlation
- concatenating DataFrames
- calculating cumulative statistics
- working with duplicate values
- preparing data for machine learning models
- dummy encoding
- working with CSV and JSON files
- merging DataFrames
- pivot tables
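As a rough illustration of a few Pandas topics from the list above (missing values, grouping, dummy encoding), the sketch below uses a small made-up DataFrame. The column names and values are assumptions chosen for the example and are not taken from the course.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value (illustrative only)
df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Rome", "Rome"],
    "sales": [10.0, np.nan, 30.0, 50.0],
})

# Working with missing values: fill the gap with the column mean
df["sales"] = df["sales"].fillna(df["sales"].mean())

# Grouping data: total sales per city
totals = df.groupby("city")["sales"].sum()

# Dummy encoding: one indicator column per city
dummies = pd.get_dummies(df["city"])

print(totals)
print(dummies)
```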
Topics you will find in the Scikit-Learn exercises:
- preparing data for machine learning models
- working with missing values, SimpleImputer class
- classification, regression, clustering
- discretization
- feature extraction
- PolynomialFeatures class
- LabelEncoder class
- OneHotEncoder class
- StandardScaler class
- dummy encoding
- splitting data into train and test sets
- LogisticRegression class
- confusion matrix
- classification report
- LinearRegression class
- MAE – Mean Absolute Error
- MSE – Mean Squared Error
- sigmoid() function
- entropy
- accuracy score
- DecisionTreeClassifier class
- GridSearchCV class
- RandomForestClassifier class
- CountVectorizer class
- TfidfVectorizer class
- KMeans class
- AgglomerativeClustering class
- hierarchical clustering
- DBSCAN class
- dimensionality reduction, PCA
- Association Rules
- LocalOutlierFactor class
- IsolationForest class
- KNeighborsClassifier class
- MultinomialNB class
- GradientBoostingRegressor class
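Several of the Scikit-Learn topics above fit together in a typical workflow: split the data, scale it, fit a classifier, then evaluate it. The following is a minimal sketch under assumptions, not a course exercise. It uses a synthetic dataset from `make_classification`, and the sample counts and random seeds are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data (assumed for illustration)
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# Splitting data into train and test sets, keeping class proportions
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# StandardScaler followed by LogisticRegression in a single pipeline,
# so the scaler is fitted on the training data only
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)

# Evaluate with the accuracy score and a confusion matrix
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)

print("accuracy:", accuracy)
print("confusion matrix:\n", cm)
```

Wrapping the scaler and the model in a pipeline is one common way to avoid leaking test-set statistics into training.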
This course is designed for people who have basic knowledge of Python and of the NumPy, Pandas and Scikit-Learn packages. It consists of 330 exercises with solutions. It is a good test for people who are learning Python and data science and are looking for new challenges. The exercises are also useful practice before a job interview. The course covers many popular topics.
If you’re wondering if it’s worth taking a step towards Python, don’t hesitate any longer and take the challenge today.