Enroll Here: Data Visualization with Python Cognitive Class Exam Quiz Answers
Introduction to Data Visualization with Python
Data visualization with Python is a powerful way to understand and communicate insights from data. Python offers several libraries, each with its own strengths, for creating visually appealing and informative plots.
Libraries for Data Visualization in Python:
- Matplotlib: A widely used 2D plotting library that produces publication-quality figures. It provides a MATLAB-like interface and can create a wide variety of plots, including line plots, scatter plots, bar plots, and histograms.
- Seaborn: Built on top of Matplotlib, Seaborn provides a high-level interface for drawing attractive and informative statistical graphics. It simplifies complex visualizations and works seamlessly with Pandas DataFrames.
- Plotly: Known for creating interactive plots, Plotly is great for web-based visualizations. It can generate a wide range of plots, including basic charts, statistical plots, 3D plots, and geographical plots.
- Bokeh: Another library for interactive visualization, Bokeh targets web browsers and is designed for building complex plots from large datasets.
- Pandas Visualization: Pandas itself provides basic plotting capabilities directly from DataFrames and Series objects. It’s handy for quick exploratory plots and integrates well with data manipulation tasks.
Getting Started with Data Visualization in Python:
To begin with data visualization in Python, you typically follow these steps:
- Install Required Libraries: Use pip or conda to install Matplotlib, Seaborn, Plotly, Bokeh, and any other libraries you want to use.
- Load Your Data: Import your data into Python. Pandas is commonly used for this purpose (import pandas as pd).
- Choose a Library: Depending on your visualization needs (static vs. interactive, type of plot, etc.), choose the appropriate library (e.g., Matplotlib for basic plots, Seaborn for statistical visualizations, Plotly for interactive plots).
- Create Your Plot: Use the chosen library to create your plot. Each library has its own syntax and functions for creating different types of plots.
- Customize and Annotate: Customize your plot as needed: add labels, titles, and legends, change colors, and so on. Annotations help convey insights clearly.
- Display or Save Your Plot: Depending on your environment (Jupyter Notebook, standalone script, web application), display your plot (plt.show() for Matplotlib, fig.show() for Plotly) or save it to a file (plt.savefig() in Matplotlib).
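The steps above can be sketched end to end with pandas and Matplotlib. This is a minimal illustration, not a recommended pipeline: the DataFrame contents, the column names, and the output file name are made up for the example, and the Agg backend is selected only so the script runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Load your data: a small hypothetical DataFrame stands in for a real file
df = pd.DataFrame({"year": [2010, 2011, 2012, 2013],
                   "total": [120, 135, 150, 170]})

# Create your plot through pandas' Matplotlib-based plotting interface
ax = df.plot(kind="line", x="year", y="total", legend=False)

# Customize and annotate
ax.set_title("Hypothetical totals per year")
ax.set_xlabel("Year")
ax.set_ylabel("Total")

# Save the current figure to a file (use plt.show() instead in an interactive session)
out_path = os.path.join(tempfile.mkdtemp(), "totals.png")
plt.savefig(out_path)
plt.close("all")
```

In a Jupyter Notebook the matplotlib.use("Agg") line would normally be dropped and the figure displayed inline instead of saved.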
Conclusion:
Data visualization in Python is a fundamental skill for data scientists and analysts. Choosing the right library depends on the complexity of the data and the type of visualization you want to create. With practice and exploration, you can create compelling visualizations that effectively communicate insights from your data.
Data Visualization with Python Cognitive Class Certificate Answers
Module 1 – Introduction to Visualization Tools Quiz Answers
Question 1: What are the layers that make up the Matplotlib architecture?
- FigureCanvas Layer, Renderer Layer, and Artist Layer.
- Backend_Bases Layer, Artist Layer, Scripting Layer.
- Backend Layer, Artist Layer, and Scripting Layer.
- Backend Layer, FigureCanvas Layer, Renderer Layer, Artist Layer, and Scripting Layer.
- Figure Layer, Artist Layer, and Scripting Layer.
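To make the layers concrete, the sketch below draws the same histogram twice, assuming the usual three-layer description of Matplotlib (backend, artist, scripting). The first version wires a backend canvas to a Figure by hand and works with Artist objects directly; the second lets pyplot do that wiring. The random data and titles are invented for the example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.backends.backend_agg import FigureCanvasAgg  # backend layer
from matplotlib.figure import Figure                          # artist layer

values = np.random.default_rng(0).normal(size=1000)  # hypothetical data

# Backend + artist layers: create the Figure and attach a canvas manually
fig = Figure()
canvas = FigureCanvasAgg(fig)
artist_ax = fig.add_subplot(111)
artist_ax.hist(values, bins=20)
artist_ax.set_title("Built with the artist layer")

# Scripting layer: pyplot creates the figure, canvas, and axes behind the scenes
plt.hist(values, bins=20)
plt.title("Built with the scripting layer")
script_ax = plt.gca()
```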
Question 2: Using the inline backend, you can modify a figure after it is rendered.
- False
- True
Question 3: Which of the following are examples of Matplotlib magic functions? Choose all that apply.
- %matplotlib inline
- #matplotlib notebook
- $matplotlib outline
- %matplotlib notebook
- #matplotlib inline
Module 2 – Basic Visualization Tools Quiz Answers
Question 1: Area plots are stacked by default.
- False
- True
Question 2: Given a pandas series, series_data, which of the following will create a histogram of series_data and align the bin edges with the horizontal tick marks?
- count, bin_edges = np.histogram(series_data)
  series_data.plot(kind='hist', xticks = count, bin_edges)
- count, bin_edges = np.histogram(series_data)
  series_data.plot(kind='hist', xticks = count)
- count, bin_edges = np.histogram(series_data)
  series_data.plot(kind='hist', xticks = bin_edges)
- series_data.plot(kind='hist')
- count, bin_edges = np.histogram(series_data)
  series_data.plot(type='hist', xticks = bin_edges)
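The bin-edge alignment in this question can be tried on a small series. The sketch below is only an illustration: series_data is filled with made-up random values, and it relies on both np.histogram and the pandas histogram defaulting to 10 bins, so the edges computed by NumPy match the bars that pandas draws.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import numpy as np
import pandas as pd

# Hypothetical data standing in for series_data
series_data = pd.Series(np.random.default_rng(0).normal(50, 10, 200))

# np.histogram returns the bin counts and the bin edges (10 bins -> 11 edges)
count, bin_edges = np.histogram(series_data)

# Passing the edges as xticks places one tick on every bin boundary
ax = series_data.plot(kind="hist", xticks=bin_edges)
```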
Question 3: Given a pandas dataframe, question, which of the following will create a horizontal barchart of the data in question?
- question.plot(type='bar', rot=90)
- question.plot(kind='bar', orientation='horizontal')
- question.plot(kind='barh')
- question.plot(kind='bar')
- question.plot(kind='bar', type='horizontal')
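A quick way to see how pandas selects a horizontal bar chart is to run it on a toy DataFrame. In this sketch the DataFrame named question and its values are hypothetical; the point is that pandas chooses the chart through the kind parameter, and 'barh' is the horizontal variant of 'bar'.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import pandas as pd

# Hypothetical survey results
question = pd.DataFrame({"votes": [12, 7, 3]}, index=["A", "B", "C"])

# kind="barh" draws one horizontal bar per row; bar length is the value
ax = question.plot(kind="barh")
bar_lengths = [bar.get_width() for bar in ax.patches]
```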
Module 3 – Specialized Visualization Tools Quiz Answers
Question 1: Pie charts are less confusing than bar charts and should be your first attempt when creating a visual.
- False
- True
Question 2: What do the letters in the box plot above represent?
- A = Mean, B = Upper Mean Quartile, C = Lower Mean Quartile, D = Inter Quartile Range, E = Minimum, and F = Outliers
- A = Mean, B = Third Quartile, C = First Quartile, D = Inter Quartile Range, E = Minimum, and F = Outliers
- A = Median, B = Third Quartile, C = First Quartile, D = Inter Quartile Range, E = Minimum, and F = Outliers
- A = Median, B = Third Quartile, C = Mean, D = Inter Quartile Range, E = Lower Quartile, and F = Outliers
- A = Mean, B = Third Quartile, C = First Quartile, D = Inter Quartile Range, E = Minimum, and F = Maximum
Question 3: What is the correct combination of function and parameter to create a box plot in Matplotlib?
- Function = box, and Parameter = type, with value = “plot”
- Function = boxplot, and Parameter = type, with value = “plot”
- Function = plot, and Parameter = type, with value = “box”
- Function = plot, and Parameter = kind, with value = “boxplot”
- Function = plot, and Parameter = kind, with value = “box”
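For reference, here is a small sketch of drawing a box plot through the pandas plot function, which is how this course reaches Matplotlib. The DataFrame and its column name are invented for the example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import pandas as pd

# Hypothetical yearly counts for a single country
df = pd.DataFrame({"Japan": [701, 756, 598, 309, 246, 198, 248, 422, 324, 494]})

# The plot function with kind="box" draws one box per column
ax = df.plot(kind="box")
ax.set_ylabel("Count")
```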
Module 4 – Advanced Visualization Tools Quiz Answers
Question 1: Which of the choices below will create the following regression line plot, given a pandas dataframe, data_dataframe?
- import seaborn as sns
  ax = sns.regplot(x="year", y="total", data=data_dataframe, color="green")
- data_dataframe.plot(kind="regression", color="green", marker="+")
- import seaborn as sns
  ax = sns.regplot(x="year", y="total", data=data_dataframe, color="green", marker="+")
- data_dataframe.plot(kind="regplot", color="green", marker="+")
- import seaborn as sns
  ax = sns.regplot(x="total", y="year", data=data_dataframe, color="green")
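The Seaborn call in these options can be tried on toy data. The sketch below assumes Seaborn is installed and uses a made-up data_dataframe; regplot draws the scatter points together with a fitted regression line, and the color and marker arguments style the points.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import pandas as pd
import seaborn as sns

# Hypothetical yearly totals
data_dataframe = pd.DataFrame({
    "year": list(range(2000, 2010)),
    "total": [10, 12, 15, 14, 18, 21, 20, 24, 27, 30],
})

# x and y name DataFrame columns; marker="+" switches the points to plus signs
ax = sns.regplot(x="year", y="total", data=data_dataframe,
                 color="green", marker="+")
```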
Question 2: In Python, creating a waffle chart is straightforward since we can easily create one using the scripting layer of Matplotlib.
- False
- True
Question 3: A word cloud (choose all that apply)
- is a depiction of the frequency of different words in some textual data.
- is a depiction of the frequency of the stopwords, such as a, the, and, in some textual data.
- is a depiction of the meaningful words in some textual data, where the more a specific word appears in the text, the bigger and bolder it appears in the word cloud.
- can be generated in Python using the word_cloud library that was developed by Andreas Mueller.
- can be easily created using Matplotlib using the scripting layer.
Module 5 – Creating Maps and Visualizing Geospatial Data Quiz Answers
Question 1: Which tile style of Folium maps is useful for data mashups and exploring river meanders and coastal zones?
- OpenStreetMap
- Mapbox Bright
- Stamen Toner
- Stamen Terrain
- River and Coastal
Question 2: You cluster markers superimposed onto a map in Folium using a feature group object.
- False
- True
Question 3: Suppose the latitude of Spain is 40.4637° N and its longitude is 3.7492° W, and you want to generate a map of Spain to visualize its hill shading and natural vegetation. Which of the following lines of code will create the right map for you?
- folium.Map(location=[40.4637, 3.7492], zoom_start=6, tiles='Stamen Toner')
- folium.Map(location=[40.4637, 3.7492], zoom_start=6, tiles='Stamen Terrain')
- folium.Map(location=[40.4637, -3.7492], zoom_start=6, tiles='Stamen Terrain')
- folium.Map(location=[-40.4637, -3.7492], zoom_start=6, tiles='Stamen Terrain')
- folium.Map(location=[40.4637, 3.7492], zoom_start=6)
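The options differ mainly in the sign of the coordinates. Folium expects location as [latitude, longitude] in signed decimal degrees, with south and west written as negative numbers. The helper below, signed_degrees, is a hypothetical name written for this example and is not part of Folium; it only shows the conversion.

```python
def signed_degrees(value, hemisphere):
    """Convert a coordinate with a hemisphere letter to signed decimal degrees.

    Hypothetical helper for illustration: N and E stay positive,
    S and W become negative.
    """
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    return -value if hemisphere in ("S", "W") else value


# Spain: 40.4637 degrees N, 3.7492 degrees W
location = [signed_degrees(40.4637, "N"), signed_degrees(3.7492, "W")]
```

The resulting list can be passed as the location argument of folium.Map.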
Data Visualization with Python Final Exam Answers
Question 1: Data visualizations are used to (check all that apply):
- explore a given dataset.
- perform data analytics and build predictive models.
- train and test a machine learning algorithm.
- share an unbiased representation of data.
- support recommendations to different stakeholders.
Question 2: Matplotlib was created by John Hunter, an American neurobiologist, and was originally developed as an EEG/ECoG visualization tool.
- False
- True
Question 3: One type of Artist object is the primitive type. Which of the following are examples of the primitive type? Check all that apply.
- Rectangle
- Figure
- Axes
- Circle
- Text
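The difference between drawable shapes and the objects that hold them is easier to see in code. In the sketch below, a Figure and an Axes act as containers while a Rectangle, a Circle, and a Text object are placed inside them; all of these are Artist instances. The positions and sizes are arbitrary values chosen for the example.

```python
from matplotlib.artist import Artist
from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure
from matplotlib.patches import Circle, Rectangle

# Container artists: the Figure holds the Axes, the Axes holds the shapes
fig = Figure()
FigureCanvasAgg(fig)  # attach a canvas so the figure could be rendered
ax = fig.add_subplot(111)

# Shapes and text placed onto the Axes
rect = ax.add_patch(Rectangle((0.1, 0.1), 0.4, 0.3))
circle = ax.add_patch(Circle((0.7, 0.7), 0.2))
label = ax.text(0.1, 0.8, "a text artist")

# Every object here derives from the Artist base class
all_artists = all(isinstance(obj, Artist) for obj in (fig, ax, rect, circle, label))
```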
Question 4: Using the notebook backend, you can modify a figure after it is rendered.
- False
- True
Question 5: The scripting layer is (check all that apply):
- comprised mainly of pyplot.
- an area on which the figure is drawn.
- a handler of user inputs such as keyboard strokes and mouse clicks.
- lighter than the Artist layer, and is intended for scientists whose goal is to perform quick exploratory analysis.
- comprised of one main object – Artist.
Question 6: Which of the following are instances of the Artist object? Check all that apply.
- Titles
- Event
- FigureCanvas
- Tick Labels
- Images
Question 7: There are three types of Artist objects.
- False
- True
Question 8: Each primitive artist may contain other composite artists as well as primitive artists.
- False
- True
Question 9: Given a pandas dataframe, question, which of the following will create an unstacked area plot of the data in question?
- question.plot(type='area', stacked=False)
- question.plot(kind='area', unstacked=True)
- question.plot(kind='area', stacked=False)
- question.plot(kind='area')
- question.plot(type='area', unstacked=True)
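Stacking in pandas area plots is controlled by the stacked parameter, which defaults to True for kind='area'. The sketch below turns it off on a made-up DataFrame named question, so each column is filled from zero instead of being piled on top of the previous one.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import pandas as pd

# Hypothetical data with two series
question = pd.DataFrame({"a": [1, 2, 3], "b": [2, 1, 2]})

# stacked=False overlays the areas; pandas makes them semi-transparent by default
ax = question.plot(kind="area", stacked=False)
```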
Question 10: Pie charts are relevant only in the rarest of circumstances, and bar charts are far superior ways to quickly get a message across.
- False
- True
Question 11: What is the correct function, parameter and value input for creating a pie chart in Matplotlib?
- Function = plot, parameter = kind, value = “pie”
- Function = pie, parameter = type, value = “plot”
- Function = plot, parameter = type, value = “pie”
- Function = pie, parameter = kind, value = “plot”
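As with the other chart types in this course, a pie chart can be drawn through the pandas plot function. The sketch below uses a hypothetical Series of shares; autopct is a Matplotlib pie option that prints the percentage on each wedge.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import pandas as pd

# Hypothetical shares per continent
shares = pd.Series([45, 30, 25], index=["Asia", "Europe", "Africa"], name="share")

# The plot function with kind="pie" draws one wedge per value
ax = shares.plot(kind="pie", autopct="%1.1f%%")
```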
Question 12: What are the five main dimensions of a box plot? Select all that apply.
- Minimum
- Standard Deviation
- Maximum
- First Quartile
- Third Quartile
- Median
- Skewness
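The numbers a box plot summarizes can be computed directly with NumPy. This sketch uses a small made-up sample and takes the minimum and maximum as the plain extremes of the data; note that Matplotlib's whiskers by default extend only to the most extreme points within 1.5 times the interquartile range of the box, treating anything beyond as an outlier.

```python
import numpy as np

# Hypothetical sample
data = np.array([2, 4, 4, 5, 7, 9, 10, 12])

# np.percentile interpolates linearly between neighboring values by default
summary = {
    "minimum": float(data.min()),
    "first_quartile": float(np.percentile(data, 25)),
    "median": float(np.percentile(data, 50)),
    "third_quartile": float(np.percentile(data, 75)),
    "maximum": float(data.max()),
}

# Interquartile range: the height of the box
iqr = summary["third_quartile"] - summary["first_quartile"]
```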
Question 13: Which of the lines of code below will create the following scatter plot, given the pandas dataframe, df_total?
- import matplotlib.pyplot as plt
  plot(kind='scatter', x='year', y='total', data=df_total)
  plt.title('Total Immigrant population to Canada from 1980 – 2013')
  plt.label ('Year')
  plt.label('Number of Immigrants')
- import matplotlib.pyplot as plt
  df_total.plot(type='scatter', x='year', y='total')
  plt.title('Total Immigrant population to Canada from 1980 – 2013')
  plt.label ('Year')
  plt.label('Number of Immigrants')
- import matplotlib.pyplot as plt
  df_total.plot(kind='scatter', x='year', y='total')
  plt.title('Total Immigrant population to Canada from 1980 – 2013')
  plt.xlabel ('Year')
  plt.ylabel('Number of Immigrants')
- import matplotlib.scripting.pyplot as plt
  df_total.plot(kind='scatter', x='year', y='total')
  plt.title('Total Immigrant population to Canada from 1980 – 2013')
  plt.label ('Year')
  plt.label('Number of Immigrants')
- import matplotlib.scripting.pyplot as plt
  df_total.plot(type='scatter', y='year', x='total')
  plt.title('Total Immigrant population to Canada from 1980 – 2013')
  plt.xlabel ('Year')
  plt.ylabel('Number of Immigrants')
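A runnable version of this kind of scatter plot looks like the sketch below. The df_total values are invented for the example; the calls themselves (DataFrame.plot with kind='scatter', then plt.title, plt.xlabel, and plt.ylabel from matplotlib.pyplot) are standard pandas and Matplotlib functions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical yearly totals standing in for df_total
df_total = pd.DataFrame({
    "year": [1980, 1990, 2000, 2010],
    "total": [99000, 150000, 190000, 250000],
})

# kind="scatter" needs the x and y column names
ax = df_total.plot(kind="scatter", x="year", y="total")

# pyplot functions act on the current axes, which is the one just created
plt.title("Total Immigrant population to Canada from 1980 - 2013")
plt.xlabel("Year")
plt.ylabel("Number of Immigrants")
```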
Question 14: A bubble plot is a variation of the scatter plot that displays three dimensions of data.
- False
- True
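A bubble plot can be sketched with Matplotlib's scatter function by passing a third variable as the marker size through the s parameter. The data below is made up, and the scaling factor applied to the sizes is an arbitrary choice for readability.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt

# Hypothetical data: x and y positions plus a third value per point
years = [1980, 1990, 2000, 2010]
totals = [20, 35, 50, 65]
weights = [1, 4, 2, 6]

# Scale the third dimension into marker areas (in points squared)
sizes = [w * 100 for w in weights]

fig, ax = plt.subplots()
bubbles = ax.scatter(years, totals, s=sizes, alpha=0.5)
ax.set_xlabel("Year")
ax.set_ylabel("Total")
```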
Question 15: Seaborn is a Python visualization library that is built on top of Matplotlib.
- False
- True
Question 16: Which of the choices below will create the following regression line plot, given a pandas dataframe, data_dataframe?
- import seaborn as sns
  ax = sns.regplot(x="year", y="total", data=data_dataframe, color="green")
- data_dataframe.plot(kind="regression", color="green", marker="+")
- import seaborn as sns
  ax = sns.regplot(x="year", y="total", data=data_dataframe, color="green", marker="+")
- data_dataframe.plot(kind="regplot", color="green", marker="+")
- import seaborn as sns
  ax = sns.regplot(x="total", y="year", data=data_dataframe, color="green")
Question 17: Which of the following can be accomplished with the package word_cloud in Python? Select all that apply.
- Create a word cloud based on the frequency of different words in some textual data.
- Create a bubble plot based on the word cloud.
- Superimpose the words in a word cloud onto the mask of any shape.
- Import default stop words.
Question 18: The following are tile styles of folium maps (choose all that apply).
- Stamen Terrain
- River Coastal
- Stamen Toner
- Mapbox Bright
- Open Stamen
Question 19: You cluster markers superimposed onto a map in Folium using a marker cluster object.
- False
- True
Question 20: Suppose the latitude and the longitude of Spain are 40.4637° N and 3.7492° W, respectively, and you want to generate a map of Spain to explore its river meanders and coastal zones. Which of the following lines of code will create the right map for you?
- folium.Map(location=[40.4637, 3.7492], zoom_start=6, tiles='Stamen Terrain')
- folium.Map(location=[40.4637, 3.7492], zoom_start=6, tiles='Stamen Toner')
- folium.Map(location=[40.4637, -3.7492], zoom_start=6, tiles='Stamen Toner')
- folium.Map(location=[-40.4637, -3.7492], zoom_start=6, tiles='Stamen Terrain')
- folium.Map(location=[40.4637, 3.7492], zoom_start=6)