Continuum Analytics News: Orange Part I: Building Predictive Models

Developer Blog

Posted Wednesday, June 15, 2016

Rahul Jain

Anaconda Intern

Continuum Analytics

In this blog series we will showcase Orange, an open source data visualization and data analysis tool, through two simple predictive models and a Monte Carlo Simulation.

Introduction to Orange

Orange is a comprehensive, component-based framework for machine learning and data mining. It is intended for both experienced users and researchers in machine learning, who want to prototype new algorithms while reusing as much of the code as possible, and for those just entering the field who can either write short Python scripts for data analysis or enjoy the powerful, easy-to-use visual programming environment. Orange includes a range of techniques, such as data management and preprocessing, supervised and unsupervised learning, performance analysis and a range of data and model visualization techniques.

Orange has a visual programming front-end for explorative data analysis and visualization called Orange Canvas. Orange Canvas takes a visual, component-based approach to programming that allows us to quickly explore and analyze data sets. Orange’s GUI is composed of widgets that communicate through channels; a set of connected widgets is called a schema. Creating schemas is quick and flexible, because widgets are added by drag and drop.

Orange can also be used as a Python library. Using the Orange library, it is easy to prototype state-of-the-art machine learning algorithms.

Building a Simple Predictive Model in Orange

We start with two simple predictive models in the Orange canvas and their corresponding Jupyter notebooks.

First, let’s take a look at our Simple Predictive Model - Part 1 notebook. Now, let’s recreate the model in the Orange Canvas. Here is the schema for predicting the results of the Iris data set via a classification tree in Orange:

1.png

Notice the toolbar on the left of the canvas: this is where the 100+ widgets can be found and dragged onto the canvas. Now, let’s take a look at how this simple schema works. The schema reads from left to right, with information flowing from widget to widget through the connecting channels. After the Iris data set is loaded, it can be viewed through a variety of widgets. Here, we chose to see the data in a simple data table and a scatter plot. When we click on those two widgets, we see the following:

2.png

3.png

With just three widgets, we already get a sense of the data we are working with. The scatter plot has an option to “Rank Projections,” which determines the best way to view our data. In this case, plotting “Petal Width vs Petal Length” lets us immediately see a potential relationship between a flower’s petal width and its iris type. Beyond scatter plots, Orange offers a variety of other widgets to help us visualize our data.

Now, let’s look at how we built our predictive model. We simply connected the data to a Classification Tree widget and can view the tree through a Classification Tree Viewer widget.

4.png

We can see exactly how our predictive model works. Now, we connect our model and our data to the “Test and Score” and “Predictions” widgets. The Test and Score widget is one way of seeing how well our Classification Tree performs:

5.png

The Predictions widget predicts the type of iris flower given the input data. Instead of looking at a long list of these predictions, we can use a confusion matrix to see our predictions and their accuracy.

6.png

Thus, we see that our model misclassified 3 of the 150 data instances.
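To make the arithmetic behind a confusion matrix concrete, here is a minimal plain-Python sketch. It does not use Orange’s API; the helper names `confusion_matrix` and `accuracy` are hypothetical, and the six toy labels are invented purely for illustration (they are not the Iris results shown above). The last line simply turns the 3-out-of-150 figure into an accuracy.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Count (actual, predicted) pairs; rows are actual classes, columns are predicted."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


def accuracy(matrix):
    """Fraction of instances that fall on the diagonal, i.e. were classified correctly."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Toy labels, invented for illustration only (not the results from the post).
labels = ["setosa", "versicolor", "virginica"]
actual = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]
predicted = ["setosa", "setosa", "versicolor", "virginica", "virginica", "virginica"]

toy_matrix = confusion_matrix(actual, predicted, labels)
toy_accuracy = accuracy(toy_matrix)

# The figure reported in the post: 3 misclassified out of 150 instances.
reported_accuracy = (150 - 3) / 150
```

Off-diagonal cells are the misclassifications, so 3 errors out of 150 instances corresponds to an accuracy of 98%.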

We have seen how quickly we can build and visualize a working predictive model in the Orange canvas. Now, let’s take a look at how the exact same model can once again be built via scripting in Orange, a Python 3 data mining library.

Building a Predictive Model with a Hold Out Test Set in Orange

In our second example of a predictive model, we make the model slightly more complicated by holding out a test set. By doing so, we can use separate datasets to train and test our model, thus helping to avoid overfitting. Here is the original notebook.

Now, let’s build the same predictive model in the Orange Canvas. The Orange Canvas will allow us to better visualize what we are building.

Orange Schema:

7.png

As you can tell, the difference between Part 1 and Part 2 is the Data Sampler widget. This widget randomly separates 30% of the data into the testing data set. Thus, we can build the same model, but more accurately test it using data the model has never seen before.
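The idea behind a random 30% hold-out can be sketched in a few lines of plain Python. This is only an illustration of the concept, not the Data Sampler widget’s actual implementation; the function name `hold_out_split`, the fixed seed, and the use of row indices in place of real data are all assumptions made for the example.

```python
import random


def hold_out_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of the rows and set aside test_fraction of them for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = round(len(shuffled) * test_fraction)
    # The first n_test shuffled rows become the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in for the 150 Iris instances: we use row indices instead of real data.
rows = list(range(150))
train, test = hold_out_split(rows)
```

With 150 instances and a 30% test fraction, this leaves 105 rows for training and 45 for testing, and no row appears in both sets.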

This example shows how easy it is to modify existing schemas. We simply introduced one new widget to make our evaluation of the model far more reliable.

Now let’s look at the same model built via the Orange Python 3 library.

Summary

In this blog post, we have introduced Orange, an open source data visualization and data analysis tool, and presented two simple predictive models. In our next blog post, we will show how to build a Monte Carlo simulation with Orange.
