















































































A practice exam focused on data visualization using Python, covering topics such as pandas, Matplotlib, and Seaborn. It includes multiple-choice questions with detailed explanations, making it a valuable resource for students and professionals looking to test their knowledge and skills in data analysis and visualization. The exam covers essential functions and methods for data manipulation, plotting, and statistical analysis, offering a comprehensive review of key concepts and techniques. This practice exam is designed to help users prepare for certifications or enhance their understanding of data visualization principles.
Typology: Exams

Question 1. Which pandas function is used to read a CSV file into a DataFrame?
A) pd.read_excel()
B) pd.read_json()
C) pd.read_csv()
D) pd.DataFrame.from_csv()
Answer: C
Explanation: pd.read_csv() directly parses a CSV file and returns a DataFrame. pd.read_excel() and pd.read_json() handle other formats, and DataFrame.from_csv() has been removed from pandas.

Question 2. In a DataFrame, which attribute returns the list of column labels?
A) df.index
B) df.columns
C) df.values
D) df.axes
Answer: B
Explanation: df.columns holds the column Index object containing all column names.

Question 3. How would you select rows 10 through 20 (inclusive) and columns ‘A’ and ‘B’ using label-based indexing?
A) df.iloc[10:21][['A','B']]
B) df.loc[10:20, ['A','B']]
C) df.loc[10:20]['A','B']
D) df.iloc[10:20, ['A','B']]
Answer: B
Explanation: loc works with labels, and its slices include both endpoints; df.loc[10:20, ['A','B']] selects the specified rows and columns.

Question 4. Which method removes rows containing any NaN values?
A) df.dropna(how='any')
B) df.fillna()
C) df.isnull()
D) df.replace(np.nan, None)
Answer: A
Explanation: dropna(how='any') drops rows where at least one element is missing.

Question 5. To replace missing values in column ‘salary’ with the median of that column, you would use:
A) df['salary'].fillna(df['salary'].mean())
B) df['salary'].fillna(df['salary'].median(), inplace=True)
C) df['salary'].replace(np.nan, df['salary'].median())
D) df.fillna(df['salary'].median())
Answer: B
Explanation: fillna with median() and inplace=True substitutes NaNs with the column median. Newer pandas versions warn about calling an in-place method on a selected column, so assigning the result back with df['salary'] = df['salary'].fillna(df['salary'].median()) is the more robust form.

Question 6. Which pandas dtype should be used for a column storing dates and times?
A) object
B) int
C) datetime64[ns]
D) float
Answer: A
Explanation: The classic IQR rule uses 1.5 × IQR beyond the quartiles.

Question 10. To rename column ‘old_name’ to ‘new_name’, which pandas method is appropriate?
A) df.rename(columns={'old_name':'new_name'}, inplace=True)
B) df.columns['old_name']='new_name'
C) df.rename_axis('new_name')
D) df.set_axis(['new_name'], axis=1)
Answer: A
Explanation: rename with a columns mapping changes the column labels.

Question 11. Which groupby aggregation returns the count of rows per group?
A) df.groupby('cat').sum()
B) df.groupby('cat').size()
C) df.groupby('cat').count()
D) df.groupby('cat').mean()
Answer: B
Explanation: size() returns the number of rows in each group, including rows with missing values. count() instead returns the number of non-null entries per column, which matches the row count only when no values are missing.

Question 12. After grouping by ‘department’, you want the average salary per department. Which code is correct?
A) df.groupby('department')['salary'].mean()
B) df.groupby('salary')['department'].mean()
C) df.groupby(['department','salary']).mean()
D) df.groupby('department').agg('salary')
Answer: A
Explanation: Selecting ‘salary’ after groupby and applying mean() yields the desired average.

Question 13. To concatenate two DataFrames vertically (stack rows), you would use:
A) pd.concat([df1, df2], axis=1)
B) pd.concat([df1, df2], axis=0)
C) df1.append(df2)
D) Both B and C
Answer: D
Explanation: Both pd.concat(..., axis=0) and append stack rows, but DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so only pd.concat works in current versions.

Question 14. Which function creates a wide-format table where each unique value of ‘year’ becomes a column?
A) df.melt(id_vars='country')
B) df.pivot_table(index='country', columns='year', values='gdp')
C) df.stack()
D) df.unstack()
Answer: B
Explanation: pivot_table reshapes data with new columns for each ‘year’.

Question 15. Binning a continuous variable ‘age’ into categories ‘young’, ‘mid’, ‘senior’ can be done with:
A) pd.cut(df['age'], bins=[0,30,60,120], labels=['young','mid','senior'])
B) df['age'].astype('category')
C) pd.qcut(df['age'], q=3)
D) All of the above
Answer: D
Explanation: All three approaches generate a histogram with the specified number of bins.

Question 19. Which Matplotlib command adds a grid to the current axes?
A) plt.showgrid()
B) plt.grid(True)
C) ax.add_grid()
D) plt.gridline()
Answer: B
Explanation: plt.grid(True) turns on the grid lines for the current axes.

Question 20. To set the x-axis label to “Time (s)”, you would use:
A) plt.xlabel('Time (s)')
B) plt.xaxis('Time (s)')
C) ax.set_xlabel('Time (s)')
D) Both A and C (depending on context)
Answer: D
Explanation: Both the pyplot function and the Axes method achieve the same result.

Question 21. Which argument changes the line style to a dashed line in plt.plot?
A) linestyle='--'
B) style='dash'
C) line='dashed'
D) dash=True
Answer: A
Explanation: linestyle='--' specifies a dashed line.

Question 22. To add a text annotation at coordinates (2, 5) with the label “Peak”, which code is correct?
A) plt.text(2, 5, 'Peak')
B) plt.annotate('Peak', xy=(2,5))
C) ax.annotate('Peak', xy=(2,5))
D) All of the above
Answer: D
Explanation: All three commands place the string “Peak” at the given location (the first as plain text, the latter two as annotations).

Question 23. Which parameter of plt.legend controls the location of the legend?
A) loc='upper right'
B) position='UR'
C) anchor='right'
D) placement='top'
Answer: A
Explanation: loc accepts location strings or numeric codes.

Question 24. When creating subplots with plt.subplots(2, 3), how many Axes objects are returned?
A) 2
B) 3
C) 5
D) sns.heatmap()
Answer: B
Explanation: jointplot shows a bivariate plot plus marginal histograms or KDEs.

Question 28. To compare the distribution of a numeric variable across categories ‘A’, ‘B’, ‘C’, which plot is most appropriate?
A) sns.boxplot(x='category', y='value')
B) sns.scatterplot(x='category', y='value')
C) sns.lineplot(x='category', y='value')
D) sns.heatmap()
Answer: A
Explanation: Boxplots succinctly convey median, quartiles, and outliers per category.

Question 29. Which Seaborn plot displays the full distribution shape and also individual observations?
A) sns.boxplot()
B) sns.violinplot()
C) sns.swarmplot()
D) sns.stripplot()
Answer: C
Explanation: swarmplot arranges points to avoid overlap while preserving distribution information.

Question 30. A correlation heatmap is generated with sns.heatmap(corr, annot=True). What does the annot=True argument do?
A) Adds a color bar
B) Shows the correlation coefficient numbers on each cell
C) Normalizes the matrix
D) Applies hierarchical clustering
Answer: B
Explanation: annot=True writes the numeric values inside the heatmap cells.

Question 31. Which Seaborn function creates a pairwise scatter matrix for all numeric columns in a DataFrame?
A) sns.pairplot(df)
B) sns.jointplot(df)
C) sns.lmplot(df)
D) sns.heatmap(df.corr())
Answer: A
Explanation: pairplot produces a grid of scatter plots (and optionally KDEs) for each variable pair.

Question 32. To apply a diverging color palette to a heatmap, which Seaborn palette name could you use?
A) 'Blues'
B) 'RdYlGn'
C) 'viridis'
D) 'Pastel1'
Answer: B
Explanation: 'RdYlGn' is a diverging palette transitioning from red through yellow to green.

Question 33. Which Seaborn style removes the top and right spines for a cleaner look?
A) px.bar()
B) px.scatter_geo()
C) px.choropleth()
D) px.treemap()
Answer: C
Explanation: px.choropleth is designed for geographic choropleth visualizations.

Question 37. Which Bokeh tool enables hover-tooltips on a scatter plot?
A) HoverTool
B) TapTool
C) BoxSelectTool
D) LassoSelectTool
Answer: A
Explanation: HoverTool displays tooltips, which can be customized with HTML templates, when the cursor rests on a glyph.

Question 38. To launch a simple interactive dashboard using Plotly Dash, which component is essential for layout?
A) dcc.Graph
B) html.Div
C) Both A and B
D) dash_core_components.Table
Answer: C
Explanation: A Dash app typically combines html.Div for structure and dcc.Graph for interactive plots.
Question 39. Which visualization is most suitable for showing how a project’s tasks overlap over time?
A) Waterfall chart
B) Gantt chart
C) Sankey diagram
D) Radar chart
Answer: B
Explanation: Gantt charts display task durations and dependencies on a time axis.

Question 40. Which chart type best illustrates the flow of customers between marketing channels and final purchase?
A) Sankey diagram
B) Pie chart
C) Box plot
D) Histogram
Answer: A
Explanation: Sankey diagrams visualize quantity flow between multiple stages.

Question 41. A truncated y-axis can mislead viewers because:
A) It hides data points
B) It exaggerates differences between values
C) It changes the data distribution
D) It adds extra grid lines
Answer: B
Explanation: Cutting off the axis origin can make small differences appear large.
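
The effect described in Question 41 can be sketched with Matplotlib. The snippet below is only an illustration under stated assumptions: the labels and values are hypothetical, and `visible_ratio` is a helper introduced here (not a Matplotlib function) to measure how large the taller bar looks relative to the shorter one.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical values chosen only for illustration: they differ by 2%.
labels = ["Product A", "Product B"]
values = [98, 100]

fig, (ax_full, ax_trunc) = plt.subplots(1, 2, figsize=(8, 3))

# Axis starting at zero: the two bars look almost identical.
ax_full.bar(labels, values)
ax_full.set_ylim(0, 110)
ax_full.set_title("y-axis from 0")

# Truncated axis: the same 2% gap now dominates the plot.
ax_trunc.bar(labels, values)
ax_trunc.set_ylim(97, 100.5)
ax_trunc.set_title("y-axis from 97")


def visible_ratio(ax, small, large):
    """Ratio of the drawn bar heights, i.e. the parts above the axis minimum."""
    bottom = ax.get_ylim()[0]
    return (large - bottom) / (small - bottom)


ratio_full = visible_ratio(ax_full, *values)    # close to 1
ratio_trunc = visible_ratio(ax_trunc, *values)  # noticeably larger than 1
fig.tight_layout()
```

With the axis starting at 97, the taller bar is drawn three times as high as the shorter one even though the underlying values differ by only 2%.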
Question 45. To reshape a DataFrame from long to wide format, you would generally use:
A) df.melt()
B) df.stack()
C) df.pivot()
D) df.unstack()
Answer: C
Explanation: pivot (or pivot_table) creates a wide format by turning unique values into columns.

Question 46. Which Seaborn function automatically adds a regression line to a scatter plot?
A) sns.scatterplot()
B) sns.regplot()
C) sns.lmplot()
D) Both B and C
Answer: D
Explanation: Both regplot and lmplot fit and display a linear regression line.

Question 47. In Plotly, which attribute of a go.Figure enables zooming and panning?
A) layout.dragmode
B) layout.hovermode
C) layout.title
D) layout.showlegend
Answer: A
Explanation: layout.dragmode selects the drag interaction: 'zoom' (the default) zooms into the dragged region, while 'pan' moves the view within the figure.
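
For the long-to-wide reshape in Question 45, a minimal pandas sketch may help. The DataFrame contents and the column names (`country`, `year`, `gdp`) are hypothetical and were chosen only for illustration.

```python
import pandas as pd

# Hypothetical long-format data: one row per (country, year) pair.
long_df = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "year": [2020, 2021, 2020, 2021],
    "gdp": [1.0, 1.5, 2.0, 2.5],
})

# pivot: each unique 'year' becomes a column, with one row per 'country'.
wide_df = long_df.pivot(index="country", columns="year", values="gdp")

# melt reverses the reshape, going from wide back to long.
back_to_long = wide_df.reset_index().melt(
    id_vars="country", var_name="year", value_name="gdp"
)
```

pivot raises an error when an index/column pair occurs more than once; pivot_table is the usual alternative in that case because it aggregates duplicates.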
Question 48. Which library provides the folium.Map class for creating interactive Leaflet maps?
A) GeoPandas
B) Folium
C) Bokeh
D) Plotly
Answer: B
Explanation: Folium wraps the Leaflet.js library for Python map creation.

Question 49. To plot a GeoDataFrame gdf with a column ‘population’, you would call:
A) gdf.plot(column='population', cmap='OrRd')
B) gdf.plot(kind='scatter', x='lon', y='lat')
C) gdf.plot(palette='viridis')
D) gdf.plot(style='population')
Answer: A
Explanation: GeoDataFrame.plot with column and cmap creates a choropleth map.

Question 50. Which chart type is ideal for visualizing hierarchical data as nested rectangles?
A) Sunburst chart
B) Treemap
C) Network graph
D) Sankey diagram
Answer: B
Explanation: Treemaps partition a rectangle into nested sub-rectangles representing hierarchical structure.
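
To make the nested-rectangle idea in Question 50 concrete, here is a rough pure-Python sketch of a slice-and-dice treemap layout. It is not how any particular plotting library implements treemaps; the functions `total` and `treemap` and the `budget` hierarchy are hypothetical names and data introduced only for illustration.

```python
def total(node):
    """Sum of the leaf values under a node (a number or a nested dict)."""
    if isinstance(node, dict):
        return sum(total(child) for child in node.values())
    return node


def treemap(node, x, y, w, h, depth=0, name="root"):
    """Return (name, depth, x, y, w, h) rectangles from a slice-and-dice layout."""
    rects = [(name, depth, x, y, w, h)]
    if isinstance(node, dict):
        size = total(node)
        offset = 0.0  # fraction of the parent already used by earlier children
        for child_name, child in node.items():
            share = total(child) / size
            if depth % 2 == 0:
                # Even depth: place children side by side along the x-axis.
                rects += treemap(child, x + offset * w, y, share * w, h,
                                 depth + 1, child_name)
            else:
                # Odd depth: stack children along the y-axis.
                rects += treemap(child, x, y + offset * h, w, share * h,
                                 depth + 1, child_name)
            offset += share
    return rects


# Hypothetical hierarchy: leaf values are sizes, dicts are parent categories.
budget = {"ops": {"rent": 30, "power": 10}, "staff": 60}
rects = treemap(budget, 0, 0, 100, 100)

# Area of each rectangle, keyed by name.
areas = {name: w * h for name, _, _, _, w, h in rects}
```

Each leaf's area is proportional to its value, and each parent rectangle exactly contains its children, which is the property the explanation describes.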
Explanation: random_state seeds the random number generator, yielding the same split each run.

Question 54. Which Matplotlib function creates a bar chart with error bars representing standard deviation?
A) plt.bar(x, height, yerr=std)
B) plt.errorbar(x, y, yerr=std)
C) plt.boxplot(data)
D) plt.hist(data, bins=10)
Answer: A
Explanation: plt.bar accepts yerr to draw vertical error bars for each bar.

Question 55. When using sns.heatmap with cmap='coolwarm', what type of data is this palette best suited for?
A) Categorical data
B) Sequential data
C) Diverging data centered around a midpoint
D) Binary data
Answer: C
Explanation: 'coolwarm' is a diverging palette highlighting deviations above and below a neutral point.

Question 56. In Plotly Express, which argument maps a categorical variable to different colors?
A) color_continuous_scale
B) color='category'
C) symbol='category'
D) size='category'
Answer: B
Explanation: color='category' assigns distinct colors to each category level.

Question 57. Which Bokeh model is used to create a legend that can be clicked to hide/show glyphs?
A) Legend
B) HoverTool
C) TapTool
D) BoxZoomTool
Answer: A
Explanation: Legend in Bokeh supports interactive click policies (its click_policy property set to 'hide' or 'mute') to toggle glyph visibility.

Question 58. To display a time series with a moving average overlay in Matplotlib, you would most likely compute the average using:
A) df.rolling(window=7).mean()
B) df.resample('D').sum()
C) df.shift(1)
D) df.diff()
Answer: A
Explanation: rolling().mean() computes a moving average over a specified window.

Question 59. Which Seaborn function automatically adds a 95% confidence interval band to a line plot?
A) sns.lineplot()
B) sns.regplot()