Kaggle Project - HR Analytics

Adrian Mutandiro, 2023-02-23

About this Project

This case study is an extra exercise to showcase the skills I have acquired during and after the Google Data Analytics Certificate course currently offered on Coursera. The purpose is to use my tools of choice to ultimately provide recommendations by following the six steps of the data analysis process: Ask, Prepare, Process, Analyze, Share and Act.

Scenario

You are an entry-level data analytics hire, and your employer, Big Tuna, has tasked you with finding a possible solution to the high attrition in the firm. Turnover tends to be expensive and time-consuming for the company, and they want to understand the root cause.

They have provided the Employee Attrition and Factors dataset for you to analyze and figure out any relationships between the variables and attrition rate. Additionally, they expect recommendations at the end of your analysis.

Step 2: Prepare

The second step is to put together the tools needed to complete the analysis. My tool of choice in this case is RStudio for both data wrangling and visual analysis.

Step 3: Process

Copy HR_Analytics.csv to a dataframe

The third step is to prepare the environment for our analysis by loading the libraries we will need:

After that, we can import the dataset. We use the “read_csv()” function to do that since our data has been downloaded in the “csv” format. We can then display the structure of our dataset using the “str()” function to see whether any of the variables are out of place or stored in the incorrect data type.

Step 3-1: Check HR_Analytics dataset for consistency and general cleaning

So far, the data appears fine. A couple of variables are stored as the “character” type but will effectively be used as “logical” values: they are Boolean in nature, but they also have a descriptive element to them, so they stay as text. The “Gender” variable can be kept the same. However, the “Attrition” and “Overtime” variables should have the same character length for the sake of consistency. Let’s have them in the simpler “Y/N” format instead. The “Yes” and “No” strings can be conditionally replaced by using square brackets to manipulate the object elements:
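The original R listing is not preserved in this copy. As a rough illustration of the same idea in Python with pandas (an assumption on my part, since the write-up itself uses R), a boolean mask inside the indexer plays the role of the square brackets. The rows below are invented for the example; only the column names come from the text.

```python
import pandas as pd

# Invented sample rows; only the column names follow the write-up.
hr = pd.DataFrame({
    "Attrition": ["Yes", "No", "No", "Yes"],
    "Overtime": ["No", "Yes", "No", "Yes"],
})

# Conditionally overwrite the long labels with one-letter codes.
for col in ["Attrition", "Overtime"]:
    hr.loc[hr[col] == "Yes", col] = "Y"
    hr.loc[hr[col] == "No", col] = "N"

print(hr)
```

Selecting rows with a condition and assigning to them leaves any other value in the column untouched, which makes unexpected labels easy to spot afterwards.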

Step 4-1: Visual Analysis - Attrition vs. Department

Now that our dataset is consistent and clean, we can carry out a number of visual analyses to compare our variables.

Our first analysis will be on department-specific attrition. We want to know where the most attrition happens based on the departments in the dataset. A pie chart is a natural choice for showing proportions of a whole. But first, we need to construct new vectors based on the conditions we are trying to meet. To get the number of people that left based on their department, we need to filter our count based on the dataset “HR_Analytics” for a specific “Department” (Sales, Human Resources or Research & Development) where the “Attrition” value equals “Y”.

The department-specific total divided by total attrition multiplied by 100 gives the percentage of workers that left Big Tuna per department. It would also give us a hint on where to focus most of our attention as far as attrition goes. The formulas required to make those calculations are as follows:
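The original R formulas are not preserved in this copy. A minimal sketch of the same arithmetic in Python with pandas (an assumption, since the write-up uses R) might look like the following. The rows are invented for illustration and do not reflect the real dataset.

```python
import pandas as pd

# Invented sample rows; the real HR_Analytics data is much larger.
hr = pd.DataFrame({
    "Department": ["Sales", "Sales", "Research & Development",
                   "Research & Development", "Research & Development",
                   "Human Resources", "Sales"],
    "Attrition": ["Y", "N", "Y", "Y", "N", "Y", "N"],
})

# Keep only the employees who left, then count them per department.
left = hr[hr["Attrition"] == "Y"]
total_left = len(left)

# Department total / total attrition * 100 = share of all leavers.
share_pct = left["Department"].value_counts() / total_left * 100
print(share_pct.round(1))
```

Note that this figure is each department's share of all leavers, so it is influenced by department size; dividing by each department's headcount instead would give a per-department attrition rate.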

Based on the visual above, we can see the department with the highest attrition rate is “Research & Development”. The “Sales” department is not too far off, it seems. “Human Resources” has the lowest rate, so it may not even be necessary to consider it as a candidate for trying to solve the attrition issue.

Step 4-2: Visual Analysis - Attrition vs. Business Travel

Though some may enjoy it, travelling for business can be a problem for others. This may be even more true if the employees have families of their own. With this in mind, we can create another pie chart below to see how each travel category looks when visually represented.

The visual analysis above somewhat aligns with our expectations: “Non-Travel” attrition is the lowest. My expectations are subverted a little, though, because the “Travel_Rarely” employee attrition rate is over double the “Travel_Frequently” rate. Correlation is not necessarily causation, however, so we can dig into the data a little more to gain more insights.

Step 4-3: Visual Analysis - Attrition vs. Job Satisfaction

Job Satisfaction is a simple enough metric to understand. While various factors determine Job Satisfaction, we are more interested in how the numbers look against attrition levels. We can use the “unique()” function to identify all the levels of satisfaction. It appears they range from 1 to 4.

The visual analysis shows that “Job Satisfaction” seems to have very little effect on attrition. No one satisfaction level heavily outweighs the others. We can also look at “Job Satisfaction” among employees that stayed with the company. We just have to change the “Y” condition to “N”.

As suspected, the weight of “Job Satisfaction” still seems to bear little significance on attrition.

Step 4-4: Visual Analysis - Attrition vs. Distance From Home

Another important comparison is the distance from home to the workplace. With remote options available, paired with the inconvenience of traveling to and from work, the expectation is higher attrition as the distance goes up. We can also use the “unique()” function here to see how many distinct distances there are. We can use a bar plot to see how attrition compares to distance from home. We can also go a step further and compare the two against all “Y”s and “N”s:

Step 5: Share

Key Findings

We can now briefly go over the observations for the visual output that appears to carry significance in affecting attrition:

  • The “Sales” and “Research & Development” departments require the most attention to improve employee retention.

Recommendations

  • Holding meetings with the departments that show high levels of attrition is a start. Getting the full picture is only possible by inquiring with those that actually perform tasks in a department.


DEV Community


Karthik Bhandary

Posted on Feb 11, 2022

Analysis And Prediction On HR Data Set For Beginners

Are you a newbie when it comes to data analysis and data modelling? If yes, then you are in the right place. In this blog, we are going to be performing some Exploratory Data Analysis on the HR Dataset available on Kaggle. We’ll also be using RandomForest to predict who left their company. This is a beginner-friendly dataset and it is easy to work with. With that out of the way, let’s get into the juicy stuff.

The first thing we always do when we start with our work is to import the libraries. You don’t need to import every single library that you’ll be using in the notebook right at the beginning itself. You can start with the bare minimum and those include:

Just the above are enough when you start working on a problem. From then on, you can add more imports at the start or import them wherever you are in the notebook. Now that we are done with importing, let’s load the data.
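The loading step appeared only as an image in the original post. A minimal sketch, assuming pandas is used: the in-memory text below stands in for the Kaggle CSV, and its values and the `hours` column are invented for illustration.

```python
import io
import pandas as pd

# Stand-in for the Kaggle CSV; with the real file you would pass its path
# to pd.read_csv instead of this in-memory buffer.
csv_text = io.StringIO(
    "hours,left,salary\n"
    "157,1,low\n"
    "262,0,medium\n"
)

df = pd.read_csv(csv_text)
print(df.head())  # first rows, as a quick sanity check
```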


CHECKING FOR THE NEATNESS OF THE DATA

Now that the data is loaded, let’s take a look at how the data is, i.e., the data types, the number of NaN values, etc. We can do that by using the .info() method.
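As a small illustration of that check (the frame below is invented, not the Kaggle data), `.info()` prints each column's dtype and non-null count, and `.isna().sum()` gives the NaN count per column directly:

```python
import numpy as np
import pandas as pd

# Invented frame with a few deliberately missing values.
df = pd.DataFrame({
    "left": [1, 0, 0],
    "salary": ["low", None, "high"],
    "hours": [157.0, np.nan, 262.0],
})

df.info()                  # dtypes and non-null counts per column
missing = df.isna().sum()  # number of missing values per column
print(missing)
```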


EXPLORATORY DATA ANALYSIS (EDA)

In this, we will be visualizing the data. By doing this we will be able to further understand how the data is and if there is any work that is to be done.

Image by author

The chart looks hilarious lol🤣 Don’t know why🤷‍♂️

Image by author

We need to split our data into a training set and test set so that the model doesn’t remember the data by heart. Before doing that let us drop the categorical columns ‘Development’ and ‘salary’.
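The original listing is not preserved here. A minimal sketch of the drop, assuming a pandas DataFrame named `df`: the column names are taken from the text above (the real CSV may label them differently), and the rows are invented.

```python
import pandas as pd

# Invented rows; column names follow the post's text.
df = pd.DataFrame({
    "left": [1, 0],
    "Development": ["sales", "hr"],
    "salary": ["low", "high"],
})

# drop returns a new frame, so the result is assigned back to df.
df = df.drop(["Development", "salary"], axis=1)
print(df.columns.tolist())
```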

Here we are passing in a list of col names that we want to drop. By specifying axis=1 we are telling it to remove the column. You can do it like this as well
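The alternative listing is also missing from this copy. It most likely used the `inplace=True` argument of `DataFrame.drop`, which is an assumption based on the sentence that follows. A sketch with the same invented frame:

```python
import pandas as pd

# Invented rows; column names follow the post's text.
df = pd.DataFrame({
    "left": [1, 0],
    "Development": ["sales", "hr"],
    "salary": ["low", "high"],
})

# With inplace=True the frame is modified directly and drop returns None.
result = df.drop(["Development", "salary"], axis=1, inplace=True)
print(result, df.columns.tolist())
```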

By doing this you don’t need to assign the result back to df, since the drop happens in place. Now that we got rid of them, let's split our data. First, separate the target variable (the one we want to predict) from the rest of the data.
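The two assignment lines below follow the post's own code; the small frame around them is invented so the snippet runs on its own.

```python
import pandas as pd

# Invented rows standing in for the cleaned HR data.
df = pd.DataFrame({
    "hours": [157, 262, 272, 223],
    "projects": [2, 5, 7, 5],
    "left": [1, 0, 1, 0],
})

X = df.drop('left', axis=1)  # features: everything except the target
y = df['left']               # target: we will predict who left
print(X.columns.tolist(), y.tolist())
```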

Then we pass the X, y into the following function.
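The function in question is `train_test_split` from `sklearn.model_selection`. A minimal sketch on invented data: the `test_size` and `random_state` values are illustrative choices, not necessarily the ones used in the original notebook.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Invented rows standing in for the cleaned HR data.
df = pd.DataFrame({
    "hours": [157, 262, 272, 223, 159, 153, 247, 259, 224, 142],
    "left": [1, 0, 1, 0, 1, 1, 0, 0, 0, 1],
})
X = df.drop("left", axis=1)
y = df["left"]

# Hold out 20% of the rows for testing; fix the seed for repeatability.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)
```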

From the name of the function, we can tell that it splits our data into training and test sets. Now we just need to fit the training data to the model of our choice. Since we are trying to predict a category (whether an employee left), we can go with RandomForest. RandomForest can do both classification and regression. We are also going to use GridSearchCV to find the best parameters for our model.
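The model code appeared only as an image in the original post. The sketch below shows one way the pieces could fit together, using scikit-learn's `RandomForestClassifier` and `GridSearchCV`. The synthetic data, the parameter grid, and the 3-fold setting are all assumptions chosen to keep the example quick, not the author's actual settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data: 200 rows, 3 numeric features, binary target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Small, illustrative grid; a real search would usually cover more values.
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
search.fit(X_train, y_train)  # cross-validates every combination, then refits the best

print(search.best_params_)
print(search.score(X_test, y_test))  # accuracy of the refit best model on held-out data
```

Scoring on the held-out test set, rather than on the training rows, is what shows whether the model has generalized instead of memorizing the data.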


In this blog we have seen:

  • A basic workflow of things
  • How to implement RandomForest
  • How to implement GridSearchCV

You can check out my Kaggle notebook here and give it an upvote if you found it helpful

I really hope that you found this analysis helpful and interesting. If you liked my work then don’t forget to follow me on Medium and YouTube for more content on productivity, self-improvement, coding, and tech. Also, check out my works on Kaggle and follow me on LinkedIn. I also write on HackerNoon; you can check that out as well.



Data Preprocessing: Case Study on Employee Attrition using Kaggle Dataset

MOHD AMIRULLAH B I N ZAINAL ABIDIN

Employee attrition is one of the biggest problems faced by some organizations and companies nowadays. It can be critical when valuable, high-performing employees leave the company suddenly: productivity suffers, engagement with the client is lost, and revenue is affected in turn if the new worker replacing the old employee is not able to maintain a good rapport with the client. Data preprocessing is a proven data mining method for addressing such issues. It prepares raw data for further processing. This paper presents the data cleaning, data reduction and data transformation processes as part of the preprocessing technique in data mining. The dataset used for this study is the Human Resource (HR) Analytics-Predictive Analysis dataset on the www.kaggle.com website. This dataset was uploaded to Kaggle in 2018. The main objective of this paper is to extract the dataset and prepare it for prediction analysis of employee attrition. Waikato Environment for Knowledge Analysis (WEKA) version 3.8.3, a data mining software package, and Microsoft Excel were used to generate the analysis.



g3nt3lman/Kaggle-HR-Analytics-Case-Study

The aim of this project was to predict whether a given employee is willing to leave a company. The prediction was made using logistic regression on the HR data set provided on Kaggle: https://www.kaggle.com/vjchoudhary7/hr-analytics-case-study/kernels

Detailed description of the project is provided in file "Project description"
