




























































































This capstone exam assesses practical skills in data retrieval, processing, and visualization using Python. Topics include using libraries like Pandas, Matplotlib, and Seaborn for data manipulation and visualization, as well as best practices for cleaning and preparing data for analysis.
Typology: Exams





























































































Question 1. Which HTTP status code indicates that the request was successful and a new resource was created?
A) 200
B) 201
C) 202
D) 204
Answer: B
Explanation: Status code 201 means “Created,” confirming that the request succeeded and a new resource was generated.

Question 2. In the requests library, which method is used to send a GET request?
A) requests.post()
B) requests.put()
C) requests.get()
D) requests.fetch()
Answer: C
Explanation: requests.get() sends an HTTP GET request to retrieve data from a specified URL.

Question 3. When parsing HTML with BeautifulSoup, which method retrieves all <a> tags from a page?
A) soup.find_all('link')
B) soup.select('a')
C) soup.find_all('a')
D) soup.get('a')
Answer: C
Explanation: soup.find_all('a') returns a list of all anchor tags in the parsed document. (Option B, soup.select('a'), would also match every <a> tag through a CSS selector; find_all is the method the question is looking for.)

Question 4. Which header field is commonly used to specify the desired response format from an API?
A) Content-Type
B) Accept
C) Authorization
D) User-Agent
Answer: B
Explanation: The Accept header tells the server which MIME types the client can handle, such as application/json.

Question 5. What is the primary purpose of OAuth 2.0 in API interactions?
A) Encrypt data payloads
B) Rate-limit requests
C) Provide token-based authentication
D) Convert XML to JSON
Answer: C
Explanation: OAuth 2.0 issues access tokens that grant limited, revocable permission to protected resources. Strictly speaking it is an authorization framework, but option C is the closest match among the choices.
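
As a small illustration of Questions 1 and 4, the sketch below uses only Python's standard library (http.HTTPStatus and urllib.request, swapped in here for the requests package the exam refers to) to look up status 201 and to build, without sending, a GET request that carries an Accept header. The URL is a placeholder assumption, not something taken from the exam.

```python
from http import HTTPStatus
from urllib.request import Request

# 201 is the "Created" status from Question 1.
created = HTTPStatus(201)
print(created.value, created.phrase)

# Build (but do not send) a GET request that asks for JSON through the
# Accept header from Question 4. The URL is a hypothetical placeholder.
req = Request(
    "https://api.example.com/items",
    headers={"Accept": "application/json"},
    method="GET",
)
print(req.get_method(), req.get_header("Accept"))
```

With the requests package, the equivalent call would pass the same dictionary through the headers argument of requests.get().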
B) lxml
C) xml.etree.ElementTree
D) csv
Answer: C
Explanation: xml.etree.ElementTree is part of the standard library and allows easy navigation of XML structures.

Question 9. In SQLite, which SQL command adds a new column to an existing table?
A) ALTER TABLE table_name ADD COLUMN column_name datatype;
B) MODIFY TABLE table_name ADD column_name datatype;
C) UPDATE TABLE table_name SET column_name datatype;
D) INSERT INTO table_name (column_name) VALUES (datatype);
Answer: A
Explanation: ALTER TABLE … ADD COLUMN … appends a new column definition to a table schema.

Question 10. Which of the following best describes database normalization?
A) Encrypting data at rest
B) Reducing data redundancy and improving integrity
C) Converting tables to CSV files
D) Indexing columns for faster queries
Answer: B
Explanation: Normalization organizes tables to minimize duplication and maintain logical data relationships.

Question 11. Which Python data type is immutable and hashable, making it usable as a dictionary key?
A) list
B) dict
C) tuple
D) set
Answer: C
Explanation: Tuples are immutable and, provided all of their elements are hashable, can themselves be hashed, which allows them to serve as dictionary keys.

Question 12. What does the strip() string method do?
A) Removes whitespace from both ends of a string
B) Replaces spaces with underscores
C) Splits a string into a list of words
D) Converts the string to uppercase
Answer: A
Explanation: strip() removes leading and trailing whitespace characters from a string.

Question 13. Which encoding should be used to correctly read a UTF-8 encoded text file in Python?
Question 16. Which pandas function is most appropriate for detecting rows with missing values?
A) df.isnull()
B) df.dropna()
C) df.fillna()
D) df.replace()
Answer: A
Explanation: df.isnull() returns a boolean DataFrame indicating where values are missing (NaN or None); df.isnull().any(axis=1) then flags the rows that contain at least one missing value.

Question 17. When scaling numeric features to a 0-1 range, which technique is used?
A) Standardization (z-score)
B) Min-max scaling
C) Binning
D) One-hot encoding
Answer: B
Explanation: Min-max scaling transforms values to the interval [0,1] based on the feature’s minimum and maximum.

Question 18. Which NLTK function tokenizes a sentence into words?
A) nltk.stem.WordNetLemmatizer()
B) nltk.tokenize.word_tokenize()
C) nltk.corpus.stopwords()
D) nltk.FreqDist()
Answer: B
Explanation: word_tokenize() splits a sentence into individual word tokens.

Question 19. In a simple search engine spider, which data structure is most efficient for tracking URLs that have already been visited?
A) List
B) Queue
C) Set
D) Stack
Answer: C
Explanation: A set provides average O(1) lookup time, so the spider can quickly determine whether a URL was processed before.

Question 20. Which library is commonly used to create and manipulate graph structures in Python?
A) pandas
B) matplotlib
C) NetworkX
D) seaborn
Answer: C
Explanation: NetworkX offers classes for nodes, edges, and many graph algorithms.
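
The visited-set idea from Question 19 can be sketched as follows. This is a minimal illustration under stated assumptions: the LINKS dictionary is a hypothetical in-memory stand-in for pages and their outgoing links, so no network access is involved, and the crawl function is an illustrative name rather than part of any library.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the links found on that page.
LINKS = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["a.html", "c.html"],
    "c.html": ["a.html", "d.html"],
    "d.html": [],
}


def crawl(start):
    """Breadth-first traversal that processes each URL at most once."""
    visited = set()         # average O(1) membership test
    queue = deque([start])  # frontier of URLs still to process
    order = []
    while queue:
        url = queue.popleft()
        if url in visited:
            continue  # already processed, skip the duplicate
        visited.add(url)
        order.append(url)
        # Only enqueue links that have not been processed yet.
        queue.extend(link for link in LINKS.get(url, []) if link not in visited)
    return order


print(crawl("a.html"))
```

Replacing the set with a list would keep the code correct but turn every membership check into a linear scan, which is why the set is the better fit for a spider.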
C) sns.distplot()
D) sns.barplot()
Answer: B
Explanation: sns.heatmap() visualizes matrix-like data with colored cells.

Question 24. When visualizing categorical data with many categories, which chart type is generally most readable?
A) Scatter plot
B) Line chart
C) Bar chart
D) Histogram
Answer: C
Explanation: Bar charts display discrete categories along one axis, making comparisons easy.

Question 25. Which Matplotlib parameter controls the color of plotted points?
A) linewidth
B) color or c
C) markerstyle
D) alpha
Answer: B
Explanation: The color (or shorthand c) argument sets the point color.
Question 26. Which Seaborn plot is ideal for visualizing the distribution of a single numeric variable?
A) sns.boxplot()
B) sns.violinplot()
C) sns.kdeplot()
D) All of the above
Answer: D
Explanation: Box, violin, and KDE plots each convey distribution characteristics; any can be appropriate.

Question 27. Which Python library can generate interactive maps using Leaflet.js?
A) Folium
B) Basemap
C) Cartopy
D) Plotly
Answer: A
Explanation: Folium builds Leaflet maps directly from Python code.

Question 28. In NetworkX, which function computes the degree centrality of all nodes?
A) nx.betweenness_centrality()
B) nx.degree_centrality()
Question 31. Which HTTP method is idempotent and typically used to update an existing resource?
A) POST
B) GET
C) PATCH
D) PUT
Answer: D
Explanation: PUT replaces the target resource with the supplied representation, so repeating the same request leaves the resource in the same state.

Question 32. In the requests library, how can you set a custom timeout of 10 seconds for a GET request?
A) requests.get(url, timeout=10)
B) requests.get(url, delay=10)
C) requests.get(url, wait=10)
D) requests.get(url, limit=10)
Answer: A
Explanation: The timeout parameter specifies the maximum waiting time for a response.

Question 33. Which BeautifulSoup parser is the fastest, though it may require external installation?
A) html.parser
B) lxml
C) html5lib
D) xml.parser
Answer: B
Explanation: The lxml parser is written in C and offers superior speed compared to the pure-Python html.parser.

Question 34. When dealing with paginated API responses, which HTTP header often indicates the URL of the next page?
A) Link
B) Location
C) Content-Location
D) X-Next-Page
Answer: A
Explanation: The Link header can contain a rel="next" URL for pagination.

Question 35. Which SQL clause is used to limit the number of rows returned by a SELECT statement in SQLite?
A) LIMIT
B) TOP
C) ROWCOUNT
D) FETCH FIRST
Answer: A
Explanation: LIMIT n restricts the result set to the first n rows.
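
To make the LIMIT clause from Question 35 concrete, here is a minimal sketch using Python's built-in sqlite3 module. The pages table and its rows are hypothetical and exist only for this illustration; an in-memory database is used so that nothing is written to disk.

```python
import sqlite3

# In-memory database so the example leaves no file behind.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table and rows, used only for illustration.
cur.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, url TEXT)")
cur.executemany(
    "INSERT INTO pages (url) VALUES (?)",
    [("a.html",), ("b.html",), ("c.html",), ("d.html",)],
)

# LIMIT caps the number of rows returned; ORDER BY makes "first" well defined.
rows = cur.execute("SELECT id, url FROM pages ORDER BY id LIMIT 2").fetchall()
print(rows)
conn.close()
```

Without an ORDER BY clause, SQLite does not promise which rows count as the "first n", so pairing the two clauses is usually the safer habit.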
C) [\w.-]+@[\w.-]+\.\w+
D) All of the above
Answer: D
Explanation: All three patterns capture a basic email structure; they differ in allowed characters and strictness.

Question 39. In text preprocessing, what is the purpose of stemming?
A) Convert words to their base or root form
B) Remove punctuation
C) Detect language
D) Encode text to UTF-8
Answer: A
Explanation: Stemming reduces inflected words (e.g., “running”) to a common stem (“run”).

Question 40. Which pandas function calculates the frequency of each unique value in a Series?
A) value_counts()
B) unique()
C) count()
D) freq()
Answer: A
Explanation: Series.value_counts() returns the count of each distinct value, sorted from most to least frequent by default.
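
The pattern listed as option C of Question 38 can be tried out with the standard re module. The sample sentence below is made up for illustration, and the pattern is deliberately loose: it is a sketch of basic email matching, not a full validator.

```python
import re

# Option C from Question 38: a deliberately loose email pattern.
EMAIL = re.compile(r"[\w.-]+@[\w.-]+\.\w+")

# Hypothetical sample text; the last address has no dot-separated domain suffix.
text = "Contact alice@example.com or bob.smith@mail.example.org, not bob@localhost."

matches = EMAIL.findall(text)
print(matches)
```

The last address is skipped because the pattern requires a literal dot followed by at least one word character after the domain part.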
Question 41. Which SQL keyword is used to combine rows from two tables based on a related column?
A) UNION
B) JOIN
C) MERGE
D) CONNECT
Answer: B
Explanation: JOIN (e.g., INNER JOIN, LEFT JOIN) merges rows that share matching column values.

Question 42. In SQLite, what data type is used to store binary large objects?
A) BLOB
B) TEXT
C) INTEGER
D) REAL
Answer: A
Explanation: BLOB stores raw binary data such as images or files.

Question 43. Which Python statement correctly opens a SQLite database file named data.db and creates a cursor?
A) conn = sqlite3.connect('data.db'); cur = conn.cursor()
B) conn = sqlite3.open('data.db'); cur = conn.create_cursor()
Explanation: add_edge(u, v) creates an undirected (or directed, depending on graph type) connection.

Question 46. Which Matplotlib function is used to create a histogram of a numeric array?
A) plt.bar()
B) plt.hist()
C) plt.scatter()
D) plt.boxplot()
Answer: B
Explanation: plt.hist() bins data and displays frequency counts as bars.

Question 47. Which Seaborn function creates a pairwise scatter plot matrix?
A) sns.pairplot()
B) sns.jointplot()
C) sns.lmplot()
D) sns.scatterplot()
Answer: A
Explanation: sns.pairplot() visualizes relationships between each pair of variables in a DataFrame.

Question 48. When visualizing time-series data, which Matplotlib function is most appropriate for showing trends?
A) plt.bar()
B) plt.plot()
C) plt.pie()
D) plt.boxplot()
Answer: B
Explanation: plt.plot() with dates on the x-axis creates a line chart that clearly shows temporal trends.

Question 49. Which parameter of plt.savefig() controls the resolution of the saved image?
A) dpi
B) size
C) format
D) quality
Answer: A
Explanation: The dpi argument sets dots per inch, affecting image clarity.

Question 50. In a RESTful API, which HTTP status code indicates that the client must authenticate to gain access?
A) 200
B) 301
C) 401
D) 403
Answer: C