









































































A set of practice questions and answers for the TIBCO Data Science Predictive Analysis exam. It covers key concepts such as predictive analytics, regression models, supervised and unsupervised learning techniques, data wrangling, and model evaluation metrics such as RMSE and F1-score. The questions test understanding of predictive modeling, including data preparation, feature engineering, and model deployment. The set is designed to help students and professionals prepare for certification or strengthen their knowledge of data science and predictive analytics.
Typology: Exams










































































Question 1. Which type of analytics predicts future outcomes based on historical data?
A) Descriptive Analytics
B) Diagnostic Analytics
C) Predictive Analytics
D) Prescriptive Analytics
Answer: C
Explanation: Predictive analytics uses historical data to forecast future events or trends, unlike descriptive or diagnostic analytics.

Question 2. In the predictive analytics workflow, which step involves understanding business objectives?
A) Data Preparation
B) Business Understanding
C) Modeling
D) Deployment
Answer: B
Explanation: The workflow begins with Business Understanding to ensure analytics align with organizational goals.

Question 3. What is the main purpose of regression models in predictive analysis?
A) To classify data into categories
B) To predict continuous numerical outcomes
C) To group similar data points
D) To summarize data
Answer: B
Explanation: Regression models are used for predicting numeric values, such as sales amounts.
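As an illustration of Question 3, the sketch below fits a one-variable linear regression by ordinary least squares and uses it to predict a numeric value. The data (advertising spend and sales) and the function names `fit_line` and `predict` are hypothetical and chosen only for this example; they are not part of the exam material or of any TIBCO product.

```python
# Hypothetical example: predict a continuous outcome (sales) from one predictor.

def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Return the predicted numeric outcome for a new predictor value."""
    return intercept + slope * x


# Made-up training data that lies exactly on the line y = 1 + 2x.
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_line(ad_spend, sales)
forecast = predict(intercept, slope, 5.0)
print(intercept, slope, forecast)
```

The output is a number on a continuous scale, which is what separates regression from classification (a category) and clustering (a group assignment).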
Question 4. Which of the following is NOT a supervised learning technique?
A) Decision Trees
B) K-Means Clustering
C) Support Vector Machines
D) Logistic Regression
Answer: B
Explanation: K-Means Clustering is an unsupervised learning method.

Question 5. What is the dependent variable in a predictive model called?
A) Feature
B) Target Variable
C) Predictor
D) Training Data
Answer: B
Explanation: The dependent variable is the target variable the model aims to predict.

Question 6. Which TIBCO tool allows connection to databases, files, and web services for data integration?
A) Information Links
B) TERR
C) Data Wrangling
D) Spotfire Dashboard
Answer: A
Explanation: Information Links in Spotfire facilitate connections to diverse data sources.

Question 7. What is the main goal of data wrangling in predictive modeling?
B) To build and fit the model
C) To visualize results
D) To deploy the model
Answer: B
Explanation: Training data is used to teach the model patterns and relationships.

Question 11. Which statistical technique tests if observed data deviates from expectations?
A) Hypothesis Testing
B) Probability Calculation
C) Clustering
D) Feature Engineering
Answer: A
Explanation: Hypothesis testing assesses if results are statistically significant.

Question 12. In Spotfire, which expression language is used for custom calculations?
A) SQL
B) Python
C) Dxp Expression Language
D) Java
Answer: C
Explanation: Spotfire’s Dxp Expression Language enables custom calculated columns.

Question 13. What is the primary use of a confusion matrix?
A) To measure regression accuracy
B) To visualize model outputs
C) To evaluate classification performance
D) To clean data
Answer: C
Explanation: A confusion matrix shows the number of correct and incorrect predictions for classification models.

Question 14. Which metric best evaluates the accuracy of a regression model?
A) F1-Score
B) RMSE
C) Recall
D) ROC Curve
Answer: B
Explanation: RMSE (Root Mean Square Error) measures the average magnitude of prediction errors in regression.

Question 15. What is overfitting in predictive modeling?
A) Model fits training data too closely
B) Model fails to capture patterns
C) Model uses too few variables
D) Model is slow to deploy
Answer: A
Explanation: Overfitting occurs when a model learns noise in training data, reducing generalization to new data.

Question 16. Which type of model is used for grouping data based on similarity?
A) Regression
B) Classification
D) Dashboarding
Answer: C
Explanation: TERR enables executing R scripts for statistical analysis within TIBCO.

Question 20. Why is data splitting into training and testing sets important?
A) To visualize data
B) To prevent overfitting
C) To encode variables
D) To deploy models
Answer: B
Explanation: Holding out a test set gives an unbiased estimate of performance on unseen data, so an overfitted model is detected before deployment.

Question 21. Which method is used to group similar customer records for segmentation?
A) Decision Trees
B) K-Means Clustering
C) Linear Regression
D) Time Series
Answer: B
Explanation: K-Means Clustering groups data by similarity, useful for customer segmentation.

Question 22. What does feature engineering involve in predictive analysis?
A) Selecting visualization types
B) Creating new variables from raw data
C) Deploying models
D) Saving data
Answer: B
Explanation: Feature engineering derives new features to improve model accuracy.

Question 23. Which metric shows the proportion of actual positives correctly identified?
A) Precision
B) Recall
C) RMSE
D) R-squared
Answer: B
Explanation: Recall measures the proportion of true positives out of all actual positives.

Question 24. What type of model would you use for predicting customer churn (yes/no)?
A) Regression
B) Classification
C) Clustering
D) Time Series
Answer: B
Explanation: Classification models predict categorical outcomes like churn.

Question 25. What is the role of aggregation in data preparation?
A) Combining multiple rows by summary statistics
B) Splitting data into sets
C) Encoding categorical variables
D) Visualizing model results
Answer: A
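To make the aggregation answer in Question 25 concrete, here is a minimal sketch that combines several rows per customer into one summary row. The transaction records and the helper name `aggregate` are hypothetical and serve only as an illustration; in Spotfire the same idea would be expressed through its own data transformation tools rather than this code.

```python
from collections import defaultdict


def aggregate(rows, key, value):
    """Group rows by `key` and summarize `value` with count, sum, and mean."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {
        k: {"count": len(v), "sum": sum(v), "mean": sum(v) / len(v)}
        for k, v in groups.items()
    }


# Made-up transaction-level data: several rows per customer.
transactions = [
    {"customer": "A", "amount": 10.0},
    {"customer": "A", "amount": 30.0},
    {"customer": "B", "amount": 5.0},
]

# One summary record per customer, ready to use as model input.
summary = aggregate(transactions, "customer", "amount")
print(summary)
```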
Question 29. Which visualization is most appropriate for comparing actual vs. predicted values?
A) Scatter Plot
B) Box Plot
C) Lift Chart
D) Pie Chart
Answer: A
Explanation: Scatter plots show the relationship between actual and predicted values, useful for regression.

Question 30. What does model drift refer to in deployed analytics?
A) Model performance decreases over time
B) Model accuracy improves
C) Data is split incorrectly
D) Visualization changes
Answer: A
Explanation: Model drift is when a model’s predictive power declines due to changes in underlying data.

Question 31. What does encoding categorical variables achieve?
A) Converts numeric data to text
B) Converts categories to numeric codes for modeling
C) Removes missing values
D) Aggregates data
Answer: B
Explanation: Encoding is necessary for algorithms that require numeric input.
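The encoding described in Question 31 can be sketched in two common forms: integer codes and one-hot indicator columns. The column values and the function names `integer_codes` and `one_hot` below are hypothetical; this is only one possible way to write it, not the method prescribed by the exam or by Spotfire.

```python
def integer_codes(values):
    """Map each distinct category to an integer code (alphabetical order)."""
    categories = sorted(set(values))
    mapping = {category: code for code, category in enumerate(categories)}
    return [mapping[v] for v in values], mapping


def one_hot(values):
    """Return one 0/1 indicator column per category, plus the column order."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories


# Made-up categorical column.
plans = ["basic", "premium", "basic", "standard"]

codes, mapping = integer_codes(plans)
indicators, columns = one_hot(plans)
print(codes, mapping)
print(columns, indicators)
```

Integer codes imply an ordering between categories, so one-hot columns are often the safer choice when the categories have no natural order.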
Question 32. What is a predictor variable?
A) The variable being predicted
B) An independent variable used to predict the target
C) The model output
D) A visualization type
Answer: B
Explanation: Predictor variables are features that help predict the target variable.

Question 33. Which metric is used to assess the proportion of correct predictions in classification?
A) Precision
B) Recall
C) Accuracy
D) RMSE
Answer: C
Explanation: Accuracy measures the percentage of correct predictions in classification.

Question 34. What is the main function of Spotfire dashboards?
A) Data cleansing
B) Interactive visualization and communication of insights
C) Running scripts
D) Data encoding
Answer: B
Explanation: Dashboards present model results and enable user interaction.

Question 35. What does R-squared ($R^2$) measure in regression analysis?
B) Incremental benefit of using model predictions vs. random selection
C) Feature importance
D) Model drift
Answer: B
Explanation: Gain charts show the lift provided by a predictive model over random guessing.

Question 39. What is the function of cross-validation in model evaluation?
A) Encoding data
B) Assessing model performance on multiple data splits
C) Aggregating data
D) Deploying models
Answer: B
Explanation: Cross-validation evaluates model stability across different subsets of data.

Question 40. In Spotfire, which visualization is used to show data range and outliers?
A) Box Plot
B) Scatter Plot
C) Pie Chart
D) Gain Chart
Answer: A
Explanation: Box plots display data distribution, median, and outliers.

Question 41. What is the purpose of marking and filtering in Spotfire dashboards?
A) Encoding variables
B) Allowing interactive data exploration
C) Model deployment
D) Data cleansing
Answer: B
Explanation: Markings and filters enable dynamic user interaction with visualizations.

Question 42. What is the role of pivoting in data preparation?
A) Aggregating data
B) Restructuring data from long to wide format or vice versa
C) Encoding variables
D) Model selection
Answer: B
Explanation: Pivoting changes the arrangement of data for better analysis.

Question 43. What is the difference between supervised and unsupervised learning?
A) Supervised uses labeled data; unsupervised does not
B) Unsupervised uses training and testing sets
C) Supervised clusters data
D) Unsupervised predicts target variable
Answer: A
Explanation: Supervised learning requires labeled data; unsupervised learning does not.

Question 44. Which method is used for feature importance analysis in Random Forest models?
A) Coefficient analysis
B) Feature importance scores
C) ROC Curve
Answer: A
Explanation: Precision evaluates the correctness of positive predictions.

Question 48. What is meant by the term ‘deployment’ in predictive analytics?
A) Training a model
B) Using a model in production for predictions
C) Visualizing data
D) Encoding variables
Answer: B
Explanation: Deployment is operationalizing a trained model for business use.

Question 49. Which type of model is best for forecasting sales over time?
A) Regression
B) Classification
C) Clustering
D) Time Series
Answer: D
Explanation: Time series models predict values based on historical, time-stamped data.

Question 50. What is the main advantage of using combination charts in Spotfire?
A) Data cleansing
B) Simultaneous visualization of multiple metrics
C) Encoding variables
D) Model deployment
Answer: B
Explanation: Combination charts display several related variables for richer insights.

Question 51. What does mean absolute error (MAE) measure in regression?
A) Average squared error
B) Average absolute difference between predicted and actual values
C) Model accuracy
D) Feature importance
Answer: B
Explanation: MAE quantifies the mean magnitude of prediction errors.

Question 52. What is the purpose of saving a predictive model in TIBCO?
A) To encode variables
B) To reuse and share models for future predictions
C) To visualize data
D) To cleanse data
Answer: B
Explanation: Saved models can be applied to new data or distributed across teams.

Question 53. What is a calculated column in Spotfire?
A) Predefined variable
B) User-defined column created from expressions
C) Model output
D) Data connection
Answer: B
Explanation: Calculated columns are derived using custom expressions for analysis.
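The MAE definition from Question 51 can be checked on a small example, shown here next to RMSE (the regression metric from Question 14) for comparison. The actual and predicted values are made up, and the function names `mae` and `rmse` are chosen only for this sketch.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error: square root of the average squared error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Made-up values; the last prediction is off by 10, the others by at most 2.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 30.0, 50.0]

mae_value = mae(actual, predicted)
rmse_value = rmse(actual, predicted)
print(mae_value, rmse_value)
```

Because RMSE squares each error before averaging, the single large miss pulls RMSE well above MAE on this data, which is why the two metrics are often reported together.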
Question 57. What is model operationalization in analytics?
A) Training the model
B) Embedding the model into business workflows
C) Data encoding
D) Visualization
Answer: B
Explanation: Operationalization is integrating predictive models into real-world processes.

Question 58. What is a target variable in classification?
A) Predictor variable
B) Categorical outcome to be predicted
C) Model output
D) Feature importance
Answer: B
Explanation: The target variable is the label or category the model predicts.

Question 59. What does data cleansing involve?
A) Constructing visualizations
B) Removing or correcting errors, outliers, and inconsistencies
C) Modeling data
D) Encoding variables
Answer: B
Explanation: Data cleansing ensures quality and reliability of input data.

Question 60. When evaluating a classification model, which metric considers both false positives and false negatives?
A) Accuracy
B) ROC Curve
C) F1-Score
D) RMSE
Answer: C
Explanation: F1-score combines precision and recall, balancing both types of errors.

Question 61. Which feature in Spotfire enables dynamic dashboards with user input controls?
A) Property Controls
B) Data Wrangling
C) Feature Engineering
D) Model Deployment
Answer: A
Explanation: Property controls allow users to interactively change dashboard parameters.

Question 62. What does model validation check for?
A) Encoding accuracy
B) Model’s ability to generalize to new data
C) Visualization clarity
D) Data cleansing
Answer: B
Explanation: Validation ensures the model works well on unseen data, not just training data.

Question 63. What is the function of data appending in Spotfire?
A) Encoding variables