DATA SCIENTIST AND ANALYST
ANDRÉ MARINHO
A portfolio of my data science and data analysis projects. Here you will find plenty of stories told with data! To see these and other projects in more detail, check out my GitHub.
MY PORTFOLIO
01
PROVIDING DATA-DRIVEN SUGGESTIONS FOR HR - TURNOVER PREDICTION

DATA ANALYTICS AND DATA SCIENCE
The HR department at Salifort Motors wants to take some initiatives to improve employee satisfaction levels at the company. They collected data from employees, but now they don’t know what to do with it. They therefore asked for data-driven suggestions based on an understanding of the data. Their question: what is likely to make an employee leave the company?
The goal of this project is to analyze the data collected by the HR department and to build a model that predicts whether or not an employee will leave the company.
02
TIKTOK VIDEO CLASSIFICATION PROJECT

DATA ANALYTICS AND DATA SCIENCE
TikTok is developing a predictive model that can determine whether a video contains a claim or offers an opinion. With an accurate classification model, TikTok can reduce the backlog of user reports and prioritize them more efficiently. This project was completed as part of the Google Advanced Data Analytics Professional Certificate and covers everything from the project proposal to the final predictive model. You can see the PACE strategy documents and the executive summary for stakeholders on my GitHub.
03
CUSTOMER SATISFACTION PREDICTION FOR AN AIRLINE COMPANY

DATA SCIENCE
The airline wants to know whether a better in-flight entertainment experience leads to higher customer satisfaction. The task was therefore to build and evaluate a model that predicts whether a future customer would be satisfied with the airline's services, given previous customer feedback about their flight experience.
For this project, four algorithms were evaluated to predict customer satisfaction: logistic regression, decision tree, random forest, and XGBoost. The applications and the results can be seen in the links above.
04
SALES ESTIMATION FOR A MARKETING CAMPAIGN

DATA ANALYSIS AND DATA SCIENCE
The marketing team of a company wants to know which type of campaign has the greatest impact on sales. They are currently allocating money to campaigns involving TV, Radio, Social Media, and influencers.
This project aims to analyze the impact of these variables and to estimate sales using simple and multiple regression, along with advanced hypothesis testing such as ANOVA.
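As a rough illustration of the two techniques named here, the sketch below fits a simple least-squares regression and computes a one-way ANOVA F statistic using only the Python standard library. It is not the project's actual code: the function names, the variable names, and all numbers are made up for the example, and treating TV spend as a three-level category is an assumption.

```python
from statistics import mean


def fit_simple_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over a list of observation lists."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical data: radio budget versus sales (illustrative values only).
radio_budget = [10.0, 20.0, 30.0, 40.0, 50.0]
sales = [25.0, 45.0, 65.0, 85.0, 105.0]
slope, intercept = fit_simple_regression(radio_budget, sales)

# Hypothetical data: sales grouped by an assumed categorical TV spend level.
sales_by_tv_level = {
    "low": [1.0, 2.0, 3.0],
    "medium": [2.0, 3.0, 4.0],
    "high": [5.0, 6.0, 7.0],
}
f_stat = one_way_anova_f(list(sales_by_tv_level.values()))

print(f"slope={slope:.2f}, intercept={intercept:.2f}, F={f_stat:.2f}")
```

A large F statistic suggests that mean sales differ between at least two of the groups. In practice a statistics library would also report the p-value and support multiple regression with several predictors.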
05
NEW YORK CITY RESTAURANTS VISUALIZATION ANALYSIS

DATA ANALYTICS
A close look at the restaurant inspections for permitted food establishments in NYC.
06
NBA PLAYERS CAREER DURATION PREDICTION

DATA ANALYTICS AND DATA SCIENCE
The National Basketball Association (NBA) is interested in retaining players who can last in the high-pressure environment of professional basketball and help the team be successful over time. In the first step, a subset of data containing information about NBA players and their performance records is analyzed. Feature engineering is conducted to determine which features most effectively predict a player's career duration. In the second step, those insights are used to build a model (Naive Bayes) that predicts whether a player will have an NBA career lasting five years or more.
07
AIR QUALITY STATISTICAL ANALYSIS

DATA ANALYTICS AND DATA SCIENCE
The goal of this project is to use statistical techniques to analyze data for the United States Environmental Protection Agency (EPA) on air quality with respect to carbon monoxide, a major air pollutant. The data includes information from more than 200 sites, identified by state, county, city, and local site names. The project goes from descriptive analysis to hypothesis testing.
08
UNICORN COMPANIES DATA INSIGHTS

DATA ANALYTICS
The goal of this project is to help an investment firm decide which companies to invest in next. The firm wants insights into unicorn companies, that is, companies valued at over one billion dollars. The data provides information on over 1,000 unicorn companies, including their industry, country, year founded, and select investors. This information was used to gain insights into how and when companies reach this prestigious milestone and to recommend next steps for the investment firm.
09
FINDING AN IDEAL CLUB FOR RICCARDO CALAFIORI

FOOTBALL ANALYTICS
In this project, advanced artificial intelligence algorithms were used to take a close look at which clubs would be an ideal fit for the Italian defender.
10
FINDING AN IDEAL REPLACEMENT FOR ERLING HAALAND

FOOTBALL ANALYTICS
This project applies data to the talent scouting process, with the goal of finding an ideal replacement for Erling Haaland.