Computer Science II - Lecture 3: Vectors & Sorting, Study notes of Data Structures and Algorithms

Part of the lecture notes for Computer Science II, Fall 2006. It covers vectors, sorting, and computing statistics using vectors, with problem-solving examples and code snippets in C++.

Typology: Study notes

Pre 2010

Uploaded on 08/09/2009

koofers-user-lrp-1

CSCI-1200 Computer Science II, Fall 2006

Lecture 3: Vectors & Sorting

Review from Lecture 2

  • Algorithm Analysis & Order Notation
  • Strings: subscripting and type declarations
  • Problem solving: two methods of thinking about output along a diagonal.

Today

  • vector container class,
  • sort function,
  • Applications in computing statistics

Reading: Koenig & Moo, Chapter 3

3.1 Problem: Grade Statistics

  • Read an unknown number of grades and compute various statistics including:
    • Mean (average)
    • Standard deviation
    • Median (middle value)

3.2 Example: Reading Numbers and Computing the Average

#include <fstream>
#include <iomanip>
#include <iostream>

int main(int argc, char* argv[]) {
  if (argc != 2) {
    std::cerr << "Usage: " << argv[0] << " grades-file\n";
    return 1;
  }
  std::ifstream grades_str(argv[1]);
  if (!grades_str) {
    std::cerr << "Can not open the grades file " << argv[1] << "\n";
    return 1;
  }

  int count = 0;  // Counting and summation variables.
  int sum = 0;
  int x;          // Input variable

  // Read in the scores one at a time, updating the sum & count. The
  // value of (grades_str >> x) is a reference to the input stream
  // grades_str. When we reach the end-of-file OR find something that
  // can't be read into an integer variable, this condition fails.
  while (grades_str >> x) {
    ++count;
    sum += x;
  }

  // Output the result with up to 6 significant digits.
  std::cout << "The average of " << count << " grades is "
            << std::setprecision(6) << double(sum) / count << std::endl;
  return 0;
}
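
The stopping behavior of the read loop can be tried without a grades file. The sketch below is illustrative only: `Tally`, `tally_ints`, and `tally_text` are hypothetical names that do not appear in the lecture, and a `std::istringstream` stands in for the file stream.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical record (not from the notes): how many integers were read
// and what they add up to.
struct Tally {
  int count;
  int sum;
};

// Reads integers from any input stream until an extraction fails.
Tally tally_ints(std::istream& in) {
  Tally t = {0, 0};
  int x;
  // (in >> x) evaluates to a reference to `in`. Used as a condition, the
  // stream tests true only if the extraction just performed succeeded.
  while (in >> x) {
    ++t.count;
    t.sum += x;
  }
  return t;
}

// Convenience wrapper so the loop can be exercised on a string.
Tally tally_text(const std::string& text) {
  std::istringstream in(text);
  return tally_ints(in);
}
```

Calling `tally_text("90 85 abc 70")` reads 90 and 85, then stops at `abc` because it cannot be read as an integer; the trailing 70 is never reached.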


3.3 Standard Deviation

  • Definition: if $a_0, a_1, a_2, \ldots, a_{n-1}$ is a sequence of $n$ values, and $\mu$ is the average of these values, then the standard deviation is

    $$\left[ \frac{\sum_{i=0}^{n-1} (a_i - \mu)^2}{n-1} \right]^{1/2}$$

  • Computing this equation requires two passes through the values:
    • Once to compute the average
    • A second time to compute the standard deviation
  • Thus, we need a way to store the values. The only tool we have so far is arrays. But arrays are fixed in size and we don’t know in advance how many values there will be. This illustrates one reason why we generally will use standard library vectors instead of arrays.
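
The two passes can be sketched as follows. This is a minimal illustration under stated assumptions, not code from the lecture: `sample_std_dev` is a hypothetical name, the values are stored as doubles in a standard library vector, and at least two values are assumed so that the n − 1 denominator is nonzero.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: standard deviation with the n - 1 denominator used
// in the definition above. Assumes values.size() >= 2.
double sample_std_dev(const std::vector<double>& values) {
  // Pass 1: compute the average (mu).
  double sum = 0.0;
  for (std::size_t i = 0; i < values.size(); ++i) {
    sum += values[i];
  }
  const double mu = sum / values.size();

  // Pass 2: accumulate the squared differences from the average.
  double sum_sq = 0.0;
  for (std::size_t i = 0; i < values.size(); ++i) {
    sum_sq += (values[i] - mu) * (values[i] - mu);
  }
  return std::sqrt(sum_sq / (values.size() - 1));
}
```

For the made-up values 2, 4, 4, 4, 5, 5, 7, 9 the average is 5, the squared differences sum to 32, and the result is the square root of 32/7, roughly 2.138.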

3.4 Vectors

  • Standard library “container class” to hold sequences.
  • A vector acts like a dynamically-sized, one-dimensional array.
  • Capabilities:
    • Holds objects of any type
    • Starts empty unless otherwise specified
    • Objects may be added to the end one at a time; there is no fixed size limit (in practice, growth is bounded only by available memory).
    • It can be treated like an ordinary array using the subscripting operator.
    • There is NO automatic checking of subscript bounds.
  • Here’s how we create an empty vector of integers:

    std::vector<int> scores;

  • Vectors are an example of a templated container class. The angle brackets < > are used to specify the type of object (the “template type”) that will be stored in the vector.
  • push_back is a vector function to append a value to the end of the vector, increasing its size by one. This is an O(1) operation on average (amortized).
    • There is NO corresponding push_front operation for vectors.
  • size is a function defined by the vector type (the vector class) that returns the number of items stored in the vector.
  • After vectors are initialized and filled in, they may be treated just like arrays.
    • In the line sum += scores[i]; scores[i] is an “r-value”, accessing the value stored at location i of the vector.
    • We could also write statements like scores[4] = 100; to change a score. Here scores[4] is an “l-value”, providing the means of storing 100 at location 4 of the vector.
    • It is the job of the programmer to ensure that any subscript value i that is used is legal: at least 0 and strictly less than scores.size().
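
A small sketch of these operations, using made-up scores. The helper names `make_scores` and `at_rejects` are hypothetical. The `at()` member function is not covered in these notes; it is part of the standard vector interface and is shown here only as a contrast to the unchecked subscript operator, since it throws `std::out_of_range` for an illegal index.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical helper: builds a small vector with push_back, then
// overwrites one entry through the subscript operator.
std::vector<int> make_scores() {
  std::vector<int> scores;  // starts empty: scores.size() == 0
  scores.push_back(85);     // size becomes 1
  scores.push_back(92);     // size becomes 2
  scores.push_back(78);     // size becomes 3
  scores[1] = 100;          // l-value use: replaces the 92
  return scores;
}

// Hypothetical helper: returns true if reading position i through at()
// throws std::out_of_range, i.e. if i is not a legal subscript for v.
bool at_rejects(const std::vector<int>& v, std::vector<int>::size_type i) {
  try {
    const int value = v.at(i);  // bounds-checked, unlike v[i]
    (void)value;                // the value itself is not needed here
    return false;
  } catch (const std::out_of_range&) {
    return true;
  }
}
```

With the vector returned by `make_scores()`, `at_rejects(scores, 2)` is false and `at_rejects(scores, 3)` is true, because 3 is not strictly less than `scores.size()`.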

3.6 Median

  • Intuitively, a median value of a sequence is a value that is smaller than half of the values in the sequence and larger than the other half.
  • More technically, if $a_0, a_1, a_2, \ldots, a_{n-1}$ is a sequence of $n$ values AND if the sequence is sorted such that $a_0 \le a_1 \le a_2 \le \cdots \le a_{n-1}$, then the median is

    $$\begin{cases} a_{(n-1)/2} & \text{if } n \text{ is odd} \\[4pt] \dfrac{a_{n/2-1} + a_{n/2}}{2} & \text{if } n \text{ is even} \end{cases}$$

  • Sorting is therefore the key to finding the median.
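
As a quick check of the definition, here it is applied to two small sorted sequences (the values are made up for illustration):

```latex
\begin{align*}
(1,\ 3,\ 5):     &\quad n = 3 \text{ (odd)},
  & \text{median} &= a_{(3-1)/2} = a_1 = 3 \\
(1,\ 3,\ 5,\ 7): &\quad n = 4 \text{ (even)},
  & \text{median} &= \frac{a_{4/2-1} + a_{4/2}}{2}
                   = \frac{a_1 + a_2}{2} = \frac{3 + 5}{2} = 4
\end{align*}
```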

3.7 Standard Library Sort Function

  • The standard library has a series of algorithms built to apply to container classes.
  • The prototypes for these algorithms (actually the functions implementing these algorithms) are in the header file <algorithm>.
  • One of the most important of the algorithms is sort.
  • It is accessed by providing the beginning and end of the container’s interval to sort.
  • As an example, the following code reads, sorts and outputs a vector of doubles

    double x;
    std::vector<double> a;
    while (std::cin >> x)
      a.push_back(x);
    std::sort(a.begin(), a.end());
    for (unsigned int i = 0; i < a.size(); ++i)
      std::cout << a[i] << '\n';

  • a.begin() is an iterator referencing the first location in the vector, while a.end() is an iterator referencing one past the last location in the vector.
    • We will learn much more about iterators in the next few weeks.
    • Every container has iterators; for example, strings have begin() and end() iterators defined on them.
  • The ordering of values by std::sort is least to greatest (technically, non-decreasing). We will see ways to change this.
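
One possible way to reverse the ordering, sketched here as an assumption rather than as the method the course goes on to use, is to pass a comparison object as a third argument to std::sort. `std::greater` comes from the standard header `<functional>`; the helper name `sorted_descending` is hypothetical.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical helper: returns a copy of the values ordered from greatest
// to least. The vector is taken by value so the caller's data is unchanged.
std::vector<double> sorted_descending(std::vector<double> values) {
  // The third argument replaces the default < comparison with >.
  std::sort(values.begin(), values.end(), std::greater<double>());
  return values;
}
```

Given the made-up input 3.5, 1.0, 4.25, 1.0, the returned vector holds 4.25, 3.5, 1.0, 1.0.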

3.8 Example: Computing the Median

Note the use of functions and parameter passing in this example:

// Compute the median value of an input set of grades.

#include <algorithm>
#include <cmath>
#include <fstream>
#include <iomanip>
#include <iostream>
#include <vector>

void read_scores(std::vector<int>& scores, std::ifstream& grade_str) {
  int x;  // input variable
  while (grade_str >> x) {
    scores.push_back(x);
  }
}

void compute_avg_and_std_dev(const std::vector<int>& s, double& avg, double& std_dev) {
  // Compute the average value.
  int sum = 0;
  for (unsigned int i = 0; i < s.size(); ++i) {
    sum += s[i];
  }
  avg = double(sum) / s.size();

  // Compute the standard deviation. With a single score the
  // denominator s.size()-1 is zero, so the result is not a number.
  double sum_sq = 0.0;
  for (unsigned int i = 0; i < s.size(); ++i) {
    sum_sq += (s[i] - avg) * (s[i] - avg);
  }
  std_dev = std::sqrt(sum_sq / (s.size() - 1));
}

double compute_median(const std::vector<int>& scores) {
  // Create a copy of the vector
  std::vector<int> scores_to_sort(scores);

  // Sort the values in the vector. By default this is increasing order.
  std::sort(scores_to_sort.begin(), scores_to_sort.end());

  // Now compute the median.
  unsigned int n = scores_to_sort.size();

  if (n % 2 == 0)  // even number of scores
    return double(scores_to_sort[n / 2] + scores_to_sort[n / 2 - 1]) / 2.0;
  else
    return double(scores_to_sort[n / 2]);  // same as (n-1)/2 because n is odd
}

int main(int argc, char* argv[]) {
  if (argc != 2) {
    std::cerr << "Usage: " << argv[0] << " grades-file\n";
    return 1;
  }
  std::ifstream grades_str(argv[1]);
  if (!grades_str) {
    std::cerr << "Can not open the grades file " << argv[1] << "\n";
    return 1;
  }

  std::vector<int> scores;           // Vector to hold the input scores; initially empty.
  read_scores(scores, grades_str);   // Read the scores, as before

  // Quit with an error message if too few scores.
  if (scores.size() == 0) {
    std::cout << "No scores entered. Please try again!" << std::endl;
    return 1;
  }

  // Compute the average, standard deviation and median
  double average, std_dev;
  compute_avg_and_std_dev(scores, average, std_dev);
  double median = compute_median(scores);

  // Output
  std::cout << "Among " << scores.size() << " grades: \n"
            << " average = " << std::setprecision(3) << average << '\n'
            << " std_dev = " << std_dev << '\n'
            << " median = " << median << std::endl;
  return 0;
}

3.11 Exercises

  1. After the above code constructing the three vectors, what will be output by the following statement?

cout << a.size() << endl << b.size() << endl << c.size() << endl;

  2. Write code to construct a vector containing 100 doubles, each having the value 55.5.
  3. Write code to construct a vector containing 1000 doubles, containing the values 0, 1,

5, etc. Write it two ways, one that uses push_back and one that does not use push_back.

3.12 Example: Alphabetize Strings

#include <algorithm>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

int main(int argc, char* argv[]) {
  if (argc != 3) {
    std::cerr << "Usage: " << argv[0] << " names-in names-out\n";
    return 1;
  }
  std::ifstream names_in_str(argv[1]);
  if (!names_in_str) {
    std::cerr << "Can not open the names file " << argv[1] << "\n";
    return 1;
  }
  std::ofstream names_out_str(argv[2]);
  if (!names_out_str) {
    std::cerr << "Can not open the output names file " << argv[2] << "\n";
    return 1;
  }

  std::vector<std::string> names;
  std::string one_name;

  // Read the strings one at a time and add them to the back of the vector.
  while (names_in_str >> one_name) {
    names.push_back(one_name);
  }

  // The sort function uses (automatically) the < operator which is
  // defined on strings. This operator compares strings "lexicographically".
  std::sort(names.begin(), names.end());

  names_out_str << "\n" << "Here are the names in alphabetical order." << std::endl;
  for (unsigned int i = 0; i < names.size(); ++i) {
    names_out_str << names[i] << std::endl;
  }
  return 0;
}
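
One consequence of lexicographic comparison is worth a small sketch: the < operator on strings compares character codes, so on ASCII-based systems every uppercase letter orders before every lowercase letter. The helper name `sorted_names` and the sample names are hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns a sorted copy of the names using the default
// < operator on std::string, exactly as the program above does.
std::vector<std::string> sorted_names(std::vector<std::string> names) {
  std::sort(names.begin(), names.end());
  return names;
}
```

Assuming an ASCII character set, sorting bob, Alice, alice, Bob gives Alice, Bob, alice, bob rather than a case-insensitive alphabetical order.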

3.13 Example: Compute the Histogram

#include <fstream>
#include <iomanip>
#include <iostream>
#include <vector>

const int BIN_SIZE = 10;

int main(int argc, char* argv[]) {
  if (argc != 2) {
    std::cerr << "Usage: " << argv[0] << " grades-file\n";
    return 1;
  }
  std::ifstream grades_str(argv[1]);
  if (!grades_str) {
    std::cerr << "Can not open the grades file " << argv[1] << "\n";
    return 1;
  }

  std::vector<int> scores;  // Vector to hold the input scores; initially empty.
  int x;                    // Input variable

  // Read the scores, as before
  while (grades_str >> x) {
    scores.push_back(x);
  }

  // Quit with an error message if too few scores.
  if (scores.size() == 0) {
    std::cout << "No scores entered. Please try again!" << std::endl;
    return 1;
  }

  // Find the maximum value
  int max_value = scores[0];
  for (unsigned int i = 1; i < scores.size(); ++i) {
    if (scores[i] > max_value) max_value = scores[i];
  }

  // Establish the number of histogram bins
  unsigned int num_bins = max_value / BIN_SIZE + 1;

  // Initialize the vector called histogram to have size num_bins and
  // to have a 0 at each entry of the vector.
  std::vector<int> histogram(num_bins, 0);

  // Now fill in the histogram. Each score maps to a location in the
  // histogram. This assumes that no score is negative.
  for (unsigned int i = 0; i < scores.size(); ++i) {
    int bin = scores[i] / BIN_SIZE;
    histogram[bin]++;
  }

  // Output the histogram. Both bounds are inclusive, so each interval
  // is printed with a closing square bracket.
  for (unsigned int b = 0; b < num_bins; ++b) {
    int lower = b * BIN_SIZE;
    int upper = lower + BIN_SIZE - 1;
    std::cout << '[' << std::setw(3) << lower << ".." << std::setw(3) << upper
              << "]: " << std::setw(3) << histogram[b] << '\n';
  }

  return 0;  // Everything ok
}
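
The binning step can also be isolated in a function and checked on a handful of values. This is a sketch under stated assumptions, not part of the lecture code: `make_histogram` is a hypothetical name, the bin width is passed as a parameter and assumed positive, and negative scores (which the program above does not handle) are simply skipped.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: counts scores into bins of width bin_size, where
// bin b covers the scores b*bin_size .. b*bin_size + bin_size - 1.
// Assumes bin_size > 0. Negative scores have no bin and are skipped.
std::vector<int> make_histogram(const std::vector<int>& scores, int bin_size) {
  // Find the largest score (0 if there are no positive scores).
  int max_value = 0;
  for (std::size_t i = 0; i < scores.size(); ++i) {
    if (scores[i] > max_value) max_value = scores[i];
  }

  // One bin for every multiple of bin_size up to and including max_value.
  std::vector<int> histogram(
      static_cast<std::size_t>(max_value / bin_size + 1), 0);

  // Integer division maps each score to the index of its bin.
  for (std::size_t i = 0; i < scores.size(); ++i) {
    if (scores[i] < 0) continue;
    ++histogram[scores[i] / bin_size];
  }
  return histogram;
}
```

With a bin width of 10, the scores 5 and 9 both fall in bin 0, 10 falls in bin 1, 95 in bin 9, and 100 in bin 10, so a maximum score of 100 produces 11 bins.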