
Projects

Flight delay prediction

Flight delays cost airlines and passengers thousands of dollars every day.
The aim is to identify the causes of those delays and to predict flight delays for the world's largest low-cost carrier.

The main keys of the project are:

  • Exploratory Data Analysis

  • Data Visualization

  • ML models for a regression problem (Decision Tree, K Neighbors, Random Forest, Linear Regression)

  • Appendix with Web Scraping

delay.png

Passenger satisfaction

Continuing our analysis of the commercial airline world, suppose now that Southwest Airlines wants to predict the satisfaction of its clients. For that, they have given me a data set containing passenger satisfaction surveys. The aim is to predict whether a passenger will be "satisfied" or "neutral or dissatisfied", and to find out which features matter most to the clients.

The main keys of the project are:

  • Exploratory Data Analysis

  • Data Visualization

  • ML models for a classification problem (Decision Tree, Logistic Regression, K Neighbors, Random Forest, AdaBoost)
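To give a feel for the classification task, here is a minimal K-nearest-neighbours sketch in plain Python. The two features (seat comfort and inflight wifi ratings), the six survey rows and the function name `predict_satisfaction` are hypothetical and only serve the illustration; the real survey data and the project's models are not reproduced here.

```python
from collections import Counter
from math import dist

# Toy, made-up survey rows: (seat comfort, inflight wifi) ratings on a 1-5 scale.
training_rows = [
    ((5, 4), "satisfied"),
    ((4, 5), "satisfied"),
    ((5, 5), "satisfied"),
    ((2, 1), "neutral or dissatisfied"),
    ((1, 2), "neutral or dissatisfied"),
    ((2, 2), "neutral or dissatisfied"),
]


def predict_satisfaction(ratings, rows=training_rows, k=3):
    """Label a passenger by majority vote among the k closest training rows."""
    nearest = sorted(rows, key=lambda row: dist(row[0], ratings))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(predict_satisfaction((4, 4)))
print(predict_satisfaction((1, 1)))
```

An odd `k` avoids tied votes when there are only two classes, as is the case here.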

passengers.png

Premier League: Manchester City analysis with PySpark

The Premier League is the highest level of English football. 20 teams play each season. Seasons typically run from August to May, with each team playing 38 matches (against every other team, both home and away).

The Premier League is the most-watched sports league in the world, broadcast in 212 territories to 643 million homes.

This project takes a look at the seasons from 2014/2015 to 2017/2018 and analyzes Manchester City's road to the cup.

premierL.png

Web Scraping and Clustering with the Premier League

The Premier League is the highest level of English football. 20 teams play each season.

The Premier League is the most-watched sports league in the world, broadcast in 212 territories to 643 million homes.

The idea of this project is to use web scraping to collect the data and then apply clustering to find similarities and differences among the players.

The data is from the 2022/2023 season and covers both teams and players.

It includes offensive, defensive, passing and summary stats.


The main keys of this project are:

  • Web Scraping to collect the data, followed by a small ETL process.

  • A short Exploratory Data Analysis.

  • Clustering using K-means.
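The clustering step can be sketched with a small, dependency-free K-means. This is an illustrative version under stated assumptions, not the project's code: the function `k_means`, the two per-player features (goals and tackles per match) and the six sample players are all invented for the example.

```python
import random
from math import dist


def k_means(points, k, iterations=20, seed=0):
    """Plain K-means: returns (centroids, labels) for a list of numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k data points picked at random
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(point, centroids[c]))
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels


# Toy, made-up players: (goals per match, tackles per match).
players = [(0.8, 0.5), (0.7, 0.6), (0.9, 0.4), (0.1, 3.0), (0.0, 3.2), (0.2, 2.8)]
centroids, labels = k_means(players, k=2)
print(labels)
```

With real scraped stats the features would normally be scaled first, since K-means relies on Euclidean distance and a feature with a large range would otherwise dominate the result.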

cluster.png

House Price Prediction

In this project we have a set of residential homes in Ames, Iowa. For each home we have features that describe it, such as the number of garages, fireplaces, etc.

The target of the project is to predict the price of each house, so this is a regression problem. We have two files, one for training the model and the other for testing it.

This project emphasizes Data Preprocessing and Exploratory Data Analysis.

The main keys of the project are:

  • Data preprocessing and cleaning.

  • Data Visualization.

  • Exploratory Data Analysis (Univariate, Bivariate and Multivariate Analysis).

  • ML models for a regression problem.

House-prices-lead.gif