Vector and Sorting: Statistical Computations in Computer Science | CSCI 1200, Study notes of Data Structures and Algorithms

Material Type: Notes; Class: DATA STRUCTURES; Subject: Computer Science; University: Rensselaer Polytechnic Institute; Term: Spring 2006;


Uploaded on 08/09/2009

CSCI-1200 Computer Science II — Spring 2006

Lecture 4 — Vectors & Sorting; Statistical Computations

Review from Lecture 3

  • Strings: subscripting and type declarations
  • Problem solving: two methods of thinking about output along a diagonal.

4.1 Today’s Class

Koenig & Moo: Chapter 3

  • vector container class,
  • sort function,
  • Applications in computing statistics

4.2 Problem: Grade Statistics

  • Read an unknown number of grades.
  • Compute:
    • Mean (average)
    • Standard deviation
    • Median (middle value)
  • We will write several programs to accomplish these.

4.3 Example: Reading Numbers and Computing the Average

// Program: average.cpp
// Author: Chuck Stewart
// Purpose: Compute the average of an input set of grades. This
//   demonstrates input of a sequence of integers, computing the
//   average, and manipulating the output precision.

#include <fstream>
#include <iomanip>
#include <iostream>

int main( int argc, char* argv[] )
{
  if ( argc != 2 ) {
    std::cerr << "Usage: " << argv[0] << " grades-file\n";
    return 1;
  }

  std::ifstream grades_str( argv[1] );
  if ( !grades_str ) {
    std::cerr << "Can not open the grades file " << argv[1] << "\n";
    return 1;
  }

  // Counting and summation variables.
  int count = 0;
  int sum = 0;
  // Input variable
  int x;

  // Read in the scores one at a time. Add each score to the sum and
  // increment the count.
  //
  // The value of the expression grades_str >> x is a reference to the
  // input stream grades_str. The while condition uses this to test
  // grades_str to see if it is ok. If it is not, the test is false
  // and the loop ends. The most common cause of this is finding the
  // end of the input, but it would also be false if something other
  // than an integer (or whitespace), such as a letter, is in the
  // input stream.
  while ( grades_str >> x ) {
    ++count;
    sum += x;
  }

  // Output the result. Set the precision to 3.
  std::cout << "The average of " << count << " grades is "
            << std::setprecision(3) << double(sum) / count << std::endl;

  return 0;
}
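The loop condition described in the comments above can be tried without a file. The following is a minimal sketch, not part of the original program: it assumes a std::istringstream standing in for the grades file, and the helper name sum_leading_ints is hypothetical. It shows that reading stops either at the end of the input or at the first token that is not an integer.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not from the lecture): sums the integers at the
// front of `text`. Reading stops at the end of the input or at the first
// token that is not an integer. `count` receives how many were read.
int sum_leading_ints(const std::string& text, int& count) {
    std::istringstream in(text);  // stands in for the grades file
    int x;
    int sum = 0;
    count = 0;
    while (in >> x) {  // the stream tests false once an extraction fails
        ++count;
        sum += x;
    }
    return sum;
}
```

Called on the text "90 80 x 70", this helper reads 90 and 80 and then stops at the letter, so the 70 after it is never read.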

4.6 Example: Using Vectors to Compute Standard Deviation

// Program: average_and_deviation.cpp
// Author: Chuck Stewart
// Purpose: Compute the average and standard deviation of an input
//   set of grades. This introduces the use of a vector to store
//   the grades upon input.

#include <fstream>
#include <iomanip>
#include <iostream>
#include <vector>   // to access the STL vector class
#include <cmath>    // to use standard math library and sqrt

int main( int argc, char* argv[] )
{
  if ( argc != 2 ) {
    std::cerr << "Usage: " << argv[0] << " grades-file\n";
    return 1;
  }
  std::ifstream grades_str( argv[1] );
  if ( !grades_str ) {
    std::cerr << "Can not open the grades file " << argv[1] << "\n";
    return 1;
  }

  std::vector<int> scores;  // Vector to hold the input scores; initially empty.
  int x;                    // Input variable

  // Read the scores, appending each to the end of the vector
  while ( grades_str >> x ) {
    scores.push_back(x);
  }

  // Quit with an error message if too few scores.
  if ( scores.size() == 0 ) {
    std::cout << "No scores entered. Please try again!" << std::endl;
    return 1;
  }

  // Compute and output the average value.
  int sum = 0;  // Accumulation of the values
  for ( unsigned int i = 0; i < scores.size(); ++i ) {
    sum += scores[i];
  }

  double average = double(sum) / scores.size();
  std::cout << "The average of " << scores.size() << " grades is "
            << std::setprecision(3) << average << std::endl;

  // Exercise: compute and output the standard deviation.

  return 0;
}
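One possible approach to the exercise is sketched below. It is not the author's solution. It assumes the population form of the standard deviation (dividing by n); the course may instead intend the sample form (dividing by n - 1). The function name std_deviation is hypothetical, and the vector is taken by constant reference to avoid copying it.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper (not from the lecture): population standard
// deviation, i.e. the square root of the mean squared difference from
// the average. Assumes scores is not empty.
double std_deviation(const std::vector<int>& scores) {
    // First pass: compute the average.
    double sum = 0.0;
    for (unsigned int i = 0; i < scores.size(); ++i) {
        sum += scores[i];
    }
    double average = sum / scores.size();

    // Second pass: accumulate squared differences from the average.
    double sum_sq_diff = 0.0;
    for (unsigned int i = 0; i < scores.size(); ++i) {
        double diff = scores[i] - average;
        sum_sq_diff += diff * diff;
    }
    return std::sqrt(sum_sq_diff / scores.size());
}
```

The second pass is why the scores have to be stored: the average must be known before any difference from it can be computed.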

4.7 Median

  • Intuitively, a median value of a sequence is a value that is less than half of the values in the sequence, and greater than half of the values in the sequence.

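  • More technically, if $a_0, a_1, a_2, \ldots, a_{n-1}$ is a sequence of $n$ values AND if the sequence is sorted such that $a_0 \le a_1 \le a_2 \le \cdots \le a_{n-1}$, then the median is

$$
\text{median} =
\begin{cases}
a_{(n-1)/2} & \text{if } n \text{ is odd} \\[4pt]
\dfrac{a_{n/2-1} + a_{n/2}}{2} & \text{if } n \text{ is even}
\end{cases}
$$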

  • Sorting is therefore the key to finding the median.
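The definition above translates almost directly into code. The following is a minimal sketch, not the lecture's own implementation: the name median_of is hypothetical, and it assumes a non-empty vector of ints. It sorts a private copy so that the caller's vector keeps its original order.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper (not from the lecture): median of a non-empty
// vector, following the odd/even definition given above.
double median_of(const std::vector<int>& values) {
    std::vector<int> sorted(values);         // copy, so the caller's order is kept
    std::sort(sorted.begin(), sorted.end()); // non-decreasing order
    std::vector<int>::size_type n = sorted.size();
    if (n % 2 == 1) {
        return sorted[(n - 1) / 2];          // middle value when n is odd
    }
    // Average of the two middle values when n is even.
    return (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
}
```

Dividing by 2.0 rather than 2 matters in the even case: with integer division, the median of 2 and 3 would come out as 2 instead of 2.5.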

4.8 Standard Library Sort Function

  • The standard library has a series of algorithms built to apply to container classes.
  • The prototypes for these algorithms (actually the functions implementing these algorithms) are in header file algorithm.

  • One of the most important of the algorithms is sort.
  • It is accessed by providing the beginning and end of the container’s interval to sort.
  • As an example, the following code reads, sorts and outputs a vector of doubles

double x;
std::vector<double> a;
while ( std::cin >> x )
  a.push_back(x);
std::sort( a.begin(), a.end() );
for ( unsigned int i=0; i<a.size(); ++i )
  std::cout << a[i] << '\n';

  • a.begin() is an iterator referencing the first location in the vector, while a.end() is an iterator referencing one past the last location in the vector.

  • We will learn much more about iterators in the next few weeks.
  • Every container has iterators: strings have begin() and end() iterators defined on them.
  • The ordering of values by std::sort is least to greatest (technically, non-decreasing). We will see ways to change this.
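As a preview of changing the ordering, std::sort accepts an optional third argument: a comparison that returns true when its first argument should come before its second. The sketch below is only an illustration; the names greater_than and sort_descending are hypothetical and do not come from the lecture.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical comparison function: true if x should come before y.
bool greater_than(double x, double y) {
    return x > y;
}

// Sorts a from greatest to least by passing the comparison function
// as the third argument to std::sort.
void sort_descending(std::vector<double>& a) {
    std::sort(a.begin(), a.end(), greater_than);
}
```

The vector is passed by reference here because the whole point of the function is to change the caller's vector.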

  std::vector<int> scores;  // Vector to hold the input scores; initially empty.

  // Read the scores, as before
  read_scores( scores, grades_str );

  // Quit with an error message if too few scores.
  if ( scores.size() == 0 ) {
    std::cout << "No scores entered. Please try again!" << std::endl;
    return 1;
  }

  // Compute the average, standard deviation and median
  double average, std_dev;
  compute_avg_and_std_dev( scores, average, std_dev );
  double median = compute_median( scores );

  // Output
  std::cout << "Among " << scores.size() << " grades: \n"
            << " average = " << std::setprecision(3) << average << '\n'
            << " std_dev = " << std_dev << '\n'
            << " median = " << median << std::endl;

  return 0;
}

4.10 Passing Vectors (and Strings) As Parameters

The following outlines rules for passing vectors as parameters. The same rules apply to passing strings.

  • If you are passing a vector as a parameter to a function and you want to make a (permanent) change to the vector, then you should pass it by reference.
  • This is illustrated by the function read_scores in the program median_grade.
  • This is very different from the behavior of arrays as parameters.
  • What if you don’t want to make changes to the vector or don’t want these changes to be permanent?
  • The answer we’ve learned so far is to pass by value.
  • The problem is that the entire vector is copied when this happens!
  • The solution is to pass by constant reference: pass it by reference, but make it a constant so that it cannot be changed.
  • This is illustrated by the functions compute_avg_and_std_dev and compute_median in the program median_grade.
  • As a general rule, you should not pass a container object, such as a vector or a string, by value because of the cost of copying. There are rare circumstances in which this rule may be violated, but not in CS II.
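The three ways of passing a vector can be compared side by side. This is a minimal sketch with hypothetical function names (append_zero, total, append_zero_to_copy); none of them appear in the lecture's programs.

```cpp
#include <vector>

// Pass by reference: the function changes the caller's vector.
void append_zero(std::vector<int>& v) {
    v.push_back(0);
}

// Pass by constant reference: no copy is made, and the function cannot
// modify v (a call such as v.push_back(0) here would not compile).
int total(const std::vector<int>& v) {
    int sum = 0;
    for (unsigned int i = 0; i < v.size(); ++i) {
        sum += v[i];
    }
    return sum;
}

// Pass by value: the entire vector is copied on each call, and the
// change made here is lost when the function returns.
void append_zero_to_copy(std::vector<int> v) {
    v.push_back(0);
}
```

After calling append_zero_to_copy on a vector of size 1, the caller's vector still has size 1. After calling append_zero, it has size 2.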

4.11 Initializing a Vector — The Use of Constructors

Here are several different ways to initialize a vector:

  • This “constructs” an empty vector of integers. Values must be placed in the vector using push_back.

vector<int> a;

  • This constructs a vector of 100 doubles, each entry storing the value 3.14. New entries can be created using push_back, but these will create entries 100, 101, 102, etc.

int n = 100;

vector<double> b( 100, 3.14 );

  • This constructs a vector of 10,000 ints. No initial value is given, so each entry is value-initialized to 0. Again, new entries can be created for the vector using push_back. These will create entries 10000, 10001, etc.

vector<int> c( n*n );

  • This constructs a vector that is an exact copy of vector b.

vector<double> d( b );

  • This is a compiler error because no constructor exists to create an int vector from a double vector. These are different types.

vector<int> e( b );
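The interaction between the size-and-value constructor and push_back can be seen in a small sketch. The function name build_example and the values used are hypothetical, chosen to differ from the vectors above.

```cpp
#include <vector>

// Hypothetical example: construct a vector of 3 doubles, each 2.5,
// then append one more value with push_back.
std::vector<double> build_example() {
    std::vector<double> v(3, 2.5);  // entries 0, 1, 2 all hold 2.5
    v.push_back(7.0);               // appended as entry 3; size is now 4
    return v;
}
```

The appended value lands after the constructed entries rather than overwriting any of them.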

4.12 Exercises

1. After the above code constructing the three vectors, what will be output by the following statement?

cout << a.size() << endl
     << b.size() << endl
     << c.size() << endl;

2. Write code to construct a vector containing 100 doubles, each having the value 55.5.

3. Write code to construct a vector containing 1000 doubles, containing the values 0, 1, 5, etc. Write it two ways, one that uses push_back and one that does not use push_back.

    std::cout << '[' << std::setw(3) << lower << ".." << std::setw(3) << upper << "): "
              << std::setw(3) << histogram[b] << '\n';
  }

  return 0;  // Everything ok
}

4.14 Example: Alphabetize Strings

// Program: alphabetize.cpp
// Author: Chuck Stewart
// Purpose: Demonstrate using a vector of strings and sorting.

#include <algorithm>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

int main( int argc, char* argv[] )
{
  if ( argc != 3 ) {
    std::cerr << "Usage: " << argv[0] << " names-in names-out\n";
    return 1;
  }
  std::ifstream names_in_str( argv[1] );
  if ( !names_in_str ) {
    std::cerr << "Can not open the names file " << argv[1] << "\n";
    return 1;
  }
  std::ofstream names_out_str( argv[2] );
  if ( !names_out_str ) {
    std::cerr << "Can not open the output names file " << argv[2] << "\n";
    return 1;
  }

  std::vector<std::string> names;
  std::string one_name;

  // Read the strings one at a time and add them to the back of the
  // vector. The reading loop ends when the end of file has been reached.
  while ( names_in_str >> one_name ) {
    names.push_back( one_name );
  }

  // Sort the vector of strings in the same manner that we sorted the
  // vector of doubles. The sort function uses (automatically) the <
  // operator which is defined on strings. This operator compares
  // strings "lexicographically".
  std::sort( names.begin(), names.end() );

  names_out_str << "\n" << "Here are the names in alphabetical order." << std::endl;
  for ( unsigned int i=0; i<names.size(); ++i ) {
    names_out_str << names[i] << std::endl;
  }

  return 0;
}
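Lexicographic comparison works character by character on character codes, so the result is not always dictionary order. The sketch below uses a hypothetical helper, sorted_names, and assumes an ASCII-compatible character set, in which every uppercase letter comes before every lowercase letter.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not from the lecture): returns a sorted copy of
// names, ordered by the default < operator on std::string.
std::vector<std::string> sorted_names(const std::vector<std::string>& names) {
    std::vector<std::string> result(names);  // copy; the input is unchanged
    std::sort(result.begin(), result.end());
    return result;
}
```

Under that assumption, the names bob, Carol, alice and Al sort as Al, Carol, alice, bob: the capitalized names come first because 'A' and 'C' have smaller character codes than 'a' and 'b'.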