CSE332S Object-Oriented Programming in C++ (Lecture 10)

Associative Containers

Container	Sorted	Unique Key	Allow duplicates
`set`	Yes	Yes	No
`multiset`	Yes	Yes	Yes
`unordered_set`	No	Yes	No
`unordered_multiset`	No	Yes	Yes
`map`	Yes	Yes	No
`multimap`	Yes	Yes	Yes
`unordered_map`	No	Yes	No
`unordered_multimap`	No	Yes	Yes

Associative containers support efficient key lookup vs. sequence containers, which lookup by position Associative containers differ in 3 design dimensions

Ordered vs. unordered (tree vs. hash structured)
We’ll look at ordered containers today, unordered next time
Set vs. map (just the key or the key and a mapped type)
Unique vs. multiple instances of a key

Ordered Associative Containers

Example: set, multiset, map, multimap

Ordered associative containers are tree structured

Insert/delete maintain sorted order, e.g. operator<
Don’t use sequence algorithms like sort or find with them
- Already sorted, so sorting unnecessary (or harmful)
- find is more efficient (logarithmic time) as a container method Ordered associative containers are bidirectional
Can iterate through them in either direction, find sub-ranges
Can use as source or destination for algorithms like copy

Set vs. Map

A set/multiset stores keys (the key is the entire value)

Used to collect single-level information (e.g., a set of words to ignore)
Avoid in-place modification of keys (especially in a set or multiset)

A map/multimap associates keys with mapped types

That style of data structure is sometimes called an associative array
Map subscripting operator takes key, returns reference to mapped type
E.g., string s = employees[id]; // returns employee name
If key does not exist, [] creates new entry with the key, value-initialized (0 if numeric, default initialized if class) instance of the mapped type

Unique vs. Multiple Instances of a Key

In set and map containers, keys are unique

In set, keys are the entire value, so every element is unique
In map, multiple keys may map to same value, but can’t duplicate keys
Attempt to insert a duplicate key is ignored by the container (returns false)

In multiset and multimap containers, duplicate keys ok

Since containers are ordered, duplicates are kept next to each other
Insertion will always succeed, at appropriate place in the order

Key Types, Comparators, Strict Weak Ordering

Like sort algorithm, can modify container’s order … … with any callable object that can be used correctly for sort

Must establish a strict weak ordering over elements

Two keys cannot both be less than each other (inequality), so comparison operator must return false if they are equal
If a < b and b < c then a < c (transitivity of inequality)
If !(a < b) and ! (b < a) then a == b (equivalence)
If a == b and b == c then a == c (transitivity of eqivalence)

Sounds like definition of order in math

Type of the callable object is used in container type

Cool example in LLM pp. 426 using decltype for a function
- Could do this by declaring your own pointer to function type
- But much easier to let compiler’s type inference figure it out for you

Pairs

Maps use pair template to hold key, mapped type

A pair can be used hold any two types
Maps use the key type as the 1st element of the pair (p.first)
Maps use the mapped type as the 2nd element of the pair (p.second)

Can compare pair variables using operators

Equivalence, less than, other relational operators

Can declare pair variables several different ways

Easiest uses initialization list (curly braces around values) (e.g. pair<string, int> p = {"hello", 1};)
Can also default construct (value initialization) (e.g. pair<string, int> p;)
Can also construct with two values (e.g. pair<string, int> p("hello", 1);)
Can also use special make_pair function (e.g. pair<string, int> p = make_pair("hello", 1);)

Unordered Containers (UCs)

Example: unordered_set, unordered_multiset, unordered_map, unordered_multimap

UCs use == to compare elements instead of < to order them

Types in unordered containers must be equality comparable
When you write your own structs, overload == as well as <

UCs store elements in indexed buckets instead of in a tree

Useful for types that don’t have an obvious ordering relation over their values

UCs use hash functions to put and find elements in buckets

May improve performance in some cases (if performance profiling suggests so)
Declare UCs with pluggable hash functions via callable objects, decltype, etc.
Or specialize the std::hash() template for your type, used by default

Summary

Use associative containers for key based lookup

Ordering of elements is maintained over the keys
Think ranges and ordering rather than position indexes
A sorted vector may be a better alternative (depends on which operations you will use most often, and their costs)

Ordered associative containers use strict weak order

Operator < or any callable object that acts like < over int can be used

Maps allow two-level (dictionary-like) lookup

Vs. sets which are used for “there or not there” lookup
Map uses a pair to associate key with mapped type

Can enforce uniqueness or allow duplicates

Duplicates are still stored in order, creating “equal ranges”

IO Libraries

`std::copy()`

std::copy()

http://www.cplusplus.com/reference/algorithm/copy/

Takes 3 parameters:

copy(InputIterator first, InputIterator last, OutputIterator result);
- [first, last) specifies the range of elements to copy.
- result specifies where we are copying to.

Example:


vector<int> v = {1, 2, 3, 4, 5};
// copy v to cout
std::copy(v.begin(), v.end(), std::ostream_iterator<int>(std::cout, " "));

Some useful destination iterator types:

ostream_iterator - iterator over an output stream (like cout)
insert_iterator - inserts elements directly into an STL container (will be practiced in studio)


#include <iostream>
#include <string>
#include <fstream>
#include <iterator>
#include <algorithm>
 
using namespace std;
 
int main(int argc, char *argv[]) {
    if (argc != 3) {
        cerr << "Usage: " << argv[0] << " <input file> <output file>" << endl;
        return 1;
    }
 
    string input_file = argv[1];
    string output_file = argv[2];
 
    ifstream input_file(input_file.c_str());
    ofstream output_file(output_file.c_str());
 
    // don't skip whitespace
    input_file >> noskipws;
 
    istream_iterator<char> i (input_file);
    ostream_iterator<char> o (output_file);
 
    // copy the input file to the output file: copy(InputIterator first, InputIterator last, OutputIterator result);
    copy(i, istream_iterator<char>(), o);
 
    cout << "Copied input file" << input_file << " to " << output_file << endl;
 
    return 0;
}

IO reviews

How to move data into and out of a program:

Using argc and argv to pass command line args
Using cout to print data out to the terminal
Using cin to obtain data from the user at run-time
Using an ifstream to read data in from a file
Using an ofstream to write data out to a file

How to move data between strings, basic types

Using an istringstream to extract formatted int values
Using an ostringstream to assemble a string

Streams

Simply a buffer of data (array of bytes).

Insertion operator (<<) specifies how to move data from a variable into an output stream Extraction operator (>>) specifies how to pull data off of an input stream and store it into a variable

Both operators defined for built-in types:

Numeric types
Pointers
Pointers to char (char *)

Cannot copy or assign stream objects

Copy construction or assignment syntax using them results in a compile-time error

Extraction operator consumes data from input stream

“Destructive read” that reads a different element each time
Use a variable if you want to read same value repeatedly

Need to test streams’ condition states

E.g., calling the is_open method on a file stream
E.g., use the stream object in a while or if test
Insertion and extraction operators return a reference to a stream object, so can test them too

File stream destructor calls close automatically

Flushing and stream manipulators

An output stream may hold onto data for a while, internally

E.g., writing chunks of text rather than a character at a time is efficient
When it writes data out (e.g., to a file, the terminal window, etc.) is entirely up to the stream, unless you tell it to flush out its buffers
If a program crashes, any un-flushed stream data is lost
So, flushing streams reasonably often is an excellent debugging trick

Can tie an input stream directly to an output stream

Output stream is then flushed by call to input stream extraction operator
E.g., my_istream.tie(&my_ostream);
cout is already tied to cin (useful for prompting the user, getting input)

Also can flush streams directly using stream manipulators

E.g., cout << flush; or cout << endl; or cout << unitbuf;

Other stream manipulators are useful for formatting streams

Field layout: setwidth, setprecision, etc.
Display notation: oct, hex, dec, boolalpha, nobooleanalpha, scientific, etc.