At a Glance
- Tasks: Dive into Azure Machine Learning and Power BI to predict diabetes scores using linear regression.
- Company: Join a forward-thinking tech company focused on innovative data solutions.
- Benefits: Enjoy flexible working hours, remote work options, and access to cutting-edge technology.
- Why this job: Gain hands-on experience in machine learning while making a real-world impact on health predictions.
- Qualifications: Basic understanding of Python and linear regression is required; no prior experience needed.
- Other info: Perfect for high school and college students eager to learn and grow in data analytics.
The predicted salary is between £36,000 and £60,000 per year.
Azure Machine Learning Integrated with Power BI
Get familiar with Machine Learning Workspace. Predict Diabetes Score using Linear Regression.
Prerequisites:
- An Azure subscription, required to create the Machine Learning workspace used for Automated ML
- Python installed on your system
- An understanding of the linear regression algorithm
Let's start with a quick overview of linear regression, then walk through creating an Automated ML model and integrating it into Power BI. Linear regression is a linear approximation of the relationship between two or more variables. Regression models are widely used by data scientists to predict continuous numerical values.
The overall process of linear regression is as follows:
- Choose a dataset that suits a clear prediction objective
- Design a machine learning model that works on the dataset
- Make predictions on the dataset (based on the linear regression algorithm)
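There is a dependent variable, Y, which is being predicted, and independent variables X1, X2, X3, ..., Xn. Each X is a predictor, and Y is a function of the X variables. In its general form, the equation of linear regression is:
Y = b0 + b1*X1 + b2*X2 + ... + bn*Xn + e
where b0 is the intercept, b1 to bn are the coefficients of the predictors, and e is the error term.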
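The least-squares fit for the single-predictor case can be sketched in plain Python. This is a minimal illustration only, not what Automated ML runs internally; the function names and the toy data are assumptions made up for the example.

```python
from statistics import mean


def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) that minimise squared error for Y = b0 + b1 * X."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    x_bar, y_bar = mean(xs), mean(ys)
    # Slope is the covariance of X and Y divided by the variance of X.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(intercept, slope, x):
    """Apply the fitted line to a new predictor value."""
    return intercept + slope * x


# Toy data (made up for illustration): Y is exactly 1 + 2 * X.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
b0, b1 = fit_simple_linear_regression(xs, ys)
print(b0, b1)                # 1.0 2.0
print(predict(b0, b1, 5.0))  # 11.0
```

With several predictors the same idea applies, but the coefficients are usually solved with matrix methods or a library rather than by hand.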
Random forest regression is a bagging technique in which samples of the main dataset are distributed among multiple decision trees, and the predictions of the individual trees are aggregated (averaged, in the case of regression) into the final prediction. The quality of the resulting model can be judged by its root mean square error (RMSE).
In the random forest process, we have base learner models M1, M2, M3, ..., Mn. These base learners are decision trees. Each decision tree is trained on a random selection of rows and columns from the main dataset: row sampling (with replacement) picks the rows, and feature sampling picks the columns. In this way, every base learner gets its own dataset D'. The trees trained on these bootstrap samples are then aggregated, which is the bagging process.
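The row sampling, feature sampling and aggregation steps can be sketched as follows. This is a simplified illustration, not a full random forest: no trees are trained here, the column names and values are made up, and the per-tree predictions passed to `aggregate` are hypothetical numbers.

```python
import random
from statistics import mean


def bootstrap_sample(rows, feature_names, n_features, rng):
    """Build one D' dataset: rows drawn with replacement, columns without."""
    sampled_rows = [rng.choice(rows) for _ in rows]            # row sampling
    sampled_features = rng.sample(feature_names, n_features)   # feature sampling
    d_prime = [{name: row[name] for name in sampled_features} for row in sampled_rows]
    return d_prime, sampled_features


def aggregate(predictions):
    """Bagging for regression: average the base learners' predictions."""
    return mean(predictions)


rng = random.Random(42)  # fixed seed so the sampling is repeatable
feature_names = ["AGE", "BMI", "BP"]  # illustrative column names
rows = [
    {"AGE": 40, "BMI": 25.0, "BP": 80.0},
    {"AGE": 52, "BMI": 30.5, "BP": 95.0},
    {"AGE": 61, "BMI": 27.3, "BP": 88.0},
    {"AGE": 35, "BMI": 22.8, "BP": 76.0},
]

# One D' dataset per base learner M1, M2, M3.
datasets = [bootstrap_sample(rows, feature_names, 2, rng) for _ in range(3)]
for d_prime, features in datasets:
    print(features, len(d_prime))

# Hypothetical predictions from three trees for one record.
print(aggregate([150.0, 170.0, 160.0]))  # 160.0
```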
The workspace is the top-level resource you need to create in order to work in the Azure Machine Learning environment. Azure provides different types of workspace, and users create the one that fits their needs and requirements. In our case, we will create a Machine Learning workspace by following the steps below:
- Go to the Azure Portal
- Search for Machine Learning in the search bar
- Click Create to create the workspace
- Provide the workspace name; the rest of the details can be left at their defaults
- Click the blue Review + create button
To launch the workspace, open it and click Launch Studio. This opens the Machine Learning studio, where we will define the dataset and train the model.
Creating Experiments using Automated ML:
Follow the instructions below to create the experiment using Automated ML:
- Click on the Automated ML option
- Click on New Automated ML run
- Click on the Create New option and select Automated ML run
- The next step is to choose the dataset: click on Create datasets and select From open datasets
- Search for diabetes in the search box and select Sample: Diabetes
- Click on Next
- Give the dataset a name and click the Create button
We have now created the dataset. The next step is to configure the model:
- Select Sample: Diabetes and click on Next
- In the next step, provide the following details:
  - Give the experiment a name
  - Select the target column Y (the actual value that the model will learn to predict)
  - Select compute cluster as the compute type
  - Select the Azure ML compute cluster compute1 (if it has not been created yet, you need to create a new one)
Now you can relax and watch what Automated ML prepares for you. This is a code-free platform, so you need not worry about the calculations and logic behind the model. A basic understanding of the algorithm is still required to understand and interpret the results.
Note: Training the model takes approximately 30 minutes. When you create an experiment, Automated ML creates multiple models for you. Based on the normalized root mean squared error, we select the best model, in this case Random Forest, and deploy it as a web service. Here you need to provide some details, such as the name of the model and the compute type. Click on Deploy.
Note: If we do not deploy the model, it will not be visible to Power BI.
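To make the model-selection metric concrete, here is a small sketch of a normalized root mean squared error. It assumes the common definition, RMSE divided by the range of the actual values; the exact computation inside Automated ML may differ, and the model names and numbers below are made up.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equally long")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def normalized_rmse(actual, predicted):
    """RMSE divided by the range of the actual values (assumed definition)."""
    value_range = max(actual) - min(actual)
    if value_range == 0:
        raise ValueError("actual values must not all be identical")
    return rmse(actual, predicted) / value_range


# Made-up actual values and predictions from two hypothetical models.
actual = [100.0, 150.0, 200.0]
candidates = {
    "model_a": [110.0, 140.0, 190.0],
    "model_b": [130.0, 150.0, 170.0],
}
scores = {name: normalized_rmse(actual, preds) for name, preds in candidates.items()}
best = min(scores, key=scores.get)  # lower error is better
print(scores)
print(best)  # model_a
```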
Integrate the model into Power BI:
Before integrating the model into Power BI, we need to make Power BI work with Python. In Power BI Desktop, go to File -> Options and settings -> Options -> Python scripting. Under Detected Python home directories, select the folder where Python is installed. Install the pandas, NumPy and Matplotlib libraries from the command prompt.
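One way to confirm that the required libraries are available is to run a short check with the same Python installation that Power BI points to. This is a sketch; the helper name `missing_packages` is made up for the example.

```python
import importlib.util

REQUIRED = ("pandas", "numpy", "matplotlib")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found by this Python installation."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

Any package reported as missing can then be installed with pip from the command prompt.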
Note: To integrate the model into Power BI, we first need a table with the same columns that were passed to the model. The records in the table can differ, but the headers must be the same, because the model was trained on those headers and recognizes only those column names.
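The header requirement can be checked before loading the table. The sketch below assumes the feature columns of the sample diabetes dataset are AGE, SEX, BMI, BP and S1 to S6; verify this list against the dataset you actually trained on. The function name is made up for the example.

```python
# Assumed feature columns of the training dataset; adjust to match your model.
EXPECTED_COLUMNS = ["AGE", "SEX", "BMI", "BP", "S1", "S2", "S3", "S4", "S5", "S6"]


def check_columns(table_columns, expected=EXPECTED_COLUMNS):
    """Return (missing, extra) column names; the match is case-sensitive."""
    missing = [name for name in expected if name not in table_columns]
    extra = [name for name in table_columns if name not in expected]
    return missing, extra


# Hypothetical table with one misspelled header and one additional column.
table_columns = ["AGE", "SEX", "Bmi", "BP", "S1", "S2", "S3", "S4", "S5", "S6", "Y"]
missing, extra = check_columns(table_columns)
print("Missing:", missing)  # Missing: ['BMI']
print("Extra:", extra)      # Extra: ['Bmi', 'Y']
```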
Now go to the Home tab -> Transform data; the Power Query Editor window will appear. In Power Query Editor, go to the Home tab -> Azure Machine Learning. You will get the list of models that have been built previously. Select the model you deployed. You can see the created date and last modified date of the model. As soon as you click OK, the model is loaded into Power BI and its predictions appear as a new column in your dataset. You can check the results by changing some records of the dataset.
Click on Close & Apply. In Power BI Desktop, add the Python visual from the Visualizations pane. In the Fields, select Y and the model output column that was loaded in Power Query Editor. Rename Y to Actual Value and AzureML:DiabetesPrediction to Predicted Value to make the visual easier to understand.
A short script then needs to be written in the Python script editor to plot a line chart of the actual and predicted values.
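A minimal sketch of such a script, assuming the two fields were renamed to Actual Value and Predicted Value as described. Power BI passes the selected fields to the script as a pandas DataFrame named `dataset`; the helper names, axis labels and figure size here are illustrative choices.

```python
def extract_series(dataset, actual_col="Actual Value", predicted_col="Predicted Value"):
    """Return (record index, actual, predicted) as plain lists."""
    actual = list(dataset[actual_col])
    predicted = list(dataset[predicted_col])
    if len(actual) != len(predicted):
        raise ValueError("both columns must have the same number of records")
    return list(range(len(actual))), actual, predicted


def plot_actual_vs_predicted(dataset):
    """Draw both series as lines on the same axes."""
    # Imported here so the helper above can be used without Matplotlib.
    import matplotlib.pyplot as plt

    x, actual, predicted = extract_series(dataset)
    plt.figure(figsize=(10, 4))
    plt.plot(x, actual, label="Actual Value")
    plt.plot(x, predicted, label="Predicted Value")
    plt.xlabel("Record")
    plt.ylabel("Diabetes score")
    plt.legend()
    plt.show()


# Inside the Power BI Python visual, `dataset` already exists, so the
# script would end with:
# plot_actual_vs_predicted(dataset)
```

Outside Power BI, the same functions can be tried with any object that supports column lookup by name, such as a pandas DataFrame or a dictionary of lists.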
Data & Analytics employer: IFI Techsolutions Pte Ltd
Contact Detail:
IFI Techsolutions Pte Ltd Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land Data & Analytics
✨Tip Number 1
Familiarise yourself with Azure Machine Learning and Power BI. Understanding how these platforms work together will give you a significant edge during interviews, as you'll be able to discuss your practical knowledge and experiences.
✨Tip Number 2
Engage in online communities or forums related to data science and machine learning. Networking with professionals in the field can provide insights into the latest trends and may even lead to referrals for job openings.
✨Tip Number 3
Work on personal projects that involve linear regression and random forest models. Having hands-on experience will not only enhance your skills but also provide you with concrete examples to discuss during interviews.
✨Tip Number 4
Prepare for technical interviews by practising common data science problems and algorithms. Being able to solve problems on the spot will demonstrate your proficiency and confidence in the subject matter.
Some tips for your application 🫡
Understand the Job Requirements: Before applying, make sure you thoroughly understand the job description for the Data & Analytics position. Familiarise yourself with Azure Machine Learning, Linear Regression, and the integration process with Power BI.
Tailor Your CV: Highlight your relevant experience and skills related to data analytics, machine learning, and any specific tools mentioned in the job description. Use keywords from the job posting to ensure your CV aligns with what the company is looking for.
Craft a Compelling Cover Letter: Write a cover letter that showcases your passion for data analytics and your understanding of the role. Mention specific projects or experiences where you've successfully used machine learning techniques, particularly in Azure or Power BI.
Proofread Your Application: Before submitting, carefully proofread your CV and cover letter for any spelling or grammatical errors. A polished application reflects attention to detail, which is crucial in data analytics roles.
How to prepare for a job interview at IFI Techsolutions Pte Ltd
✨Brush Up on Linear Regression
Make sure you have a solid understanding of linear regression, as it's a key part of the job. Be prepared to explain how it works and its applications in predicting outcomes, especially in relation to the diabetes score.
✨Familiarise Yourself with Azure ML
Since the role involves using Azure Machine Learning, take some time to explore the platform. Understand how to create a workspace, run experiments, and deploy models. This knowledge will show your enthusiasm and readiness for the position.
✨Prepare for Technical Questions
Expect technical questions related to data analytics and machine learning. Be ready to discuss your experience with Python, data manipulation libraries like Pandas and NumPy, and how you would approach building a predictive model.
✨Showcase Your Problem-Solving Skills
During the interview, highlight your problem-solving abilities. Discuss past projects where you used data analytics to derive insights or solve complex problems. This will demonstrate your practical experience and analytical mindset.