CSE332S Object-Oriented Programming in C++ (Lecture 10)
Associative Containers
| Container | Sorted | Unique Key | Allow duplicates |
|---|---|---|---|
set | Yes | Yes | No |
multiset | Yes | Yes | Yes |
unordered_set | No | Yes | No |
unordered_multiset | No | Yes | Yes |
map | Yes | Yes | No |
multimap | Yes | Yes | Yes |
unordered_map | No | Yes | No |
unordered_multimap | No | Yes | Yes |
Associative containers support efficient key lookup vs. sequence containers, which lookup by position Associative containers differ in 3 design dimensions
- Ordered vs. unordered (tree vs. hash structured)
- We’ll look at ordered containers today, unordered next time
- Set vs. map (just the key or the key and a mapped type)
- Unique vs. multiple instances of a key
Ordered Associative Containers
Example: set, multiset, map, multimap
Ordered associative containers are tree structured
- Insert/delete maintain sorted order, e.g.
operator< - Don’t use sequence algorithms like
sortorfindwith them- Already sorted, so sorting unnecessary (or harmful)
findis more efficient (logarithmic time) as a container method Ordered associative containers are bidirectional
- Can iterate through them in either direction, find sub-ranges
- Can use as source or destination for algorithms like
copy
Set vs. Map
A set/multiset stores keys (the key is the entire value)
- Used to collect single-level information (e.g., a set of words to ignore)
- Avoid in-place modification of keys (especially in a set or multiset)
A map/multimap associates keys with mapped types
- That style of data structure is sometimes called an associative array
- Map subscripting operator takes key, returns reference to mapped type
- E.g.,
string s = employees[id]; // returns employee name - If key does not exist,
[]creates new entry with the key, value-initialized (0 if numeric, default initialized if class) instance of the mapped type
Unique vs. Multiple Instances of a Key
In set and map containers, keys are unique
- In set, keys are the entire value, so every element is unique
- In map, multiple keys may map to same value, but can’t duplicate keys
- Attempt to insert a duplicate key is ignored by the container (returns false)
In multiset and multimap containers, duplicate keys ok
- Since containers are ordered, duplicates are kept next to each other
- Insertion will always succeed, at appropriate place in the order
Key Types, Comparators, Strict Weak Ordering
Like sort algorithm, can modify container’s order …
… with any callable object that can be used correctly for sort
Must establish a strict weak ordering over elements
- Two keys cannot both be less than each other (inequality), so comparison operator must return
falseif they are equal - If
a < bandb < cthena < c(transitivity of inequality) - If
!(a < b)and! (b < a)thena == b(equivalence) - If
a == bandb == cthena == c(transitivity of eqivalence)
Sounds like definition of order in math
Type of the callable object is used in container type
- Cool example in LLM pp. 426 using
decltypefor a function- Could do this by declaring your own pointer to function type
- But much easier to let compiler’s type inference figure it out for you
Pairs
Maps use pair template to hold key, mapped type
- A
paircan be used hold any two types - Maps use the key type as the 1st element of the pair (
p.first) - Maps use the mapped type as the 2nd element of the pair (
p.second)
Can compare pair variables using operators
- Equivalence, less than, other relational operators
Can declare pair variables several different ways
- Easiest uses initialization list (curly braces around values) (e.g.
pair<string, int> p = {"hello", 1};) - Can also default construct (value initialization) (e.g.
pair<string, int> p;) - Can also construct with two values (e.g.
pair<string, int> p("hello", 1);) - Can also use special
make_pairfunction (e.g.pair<string, int> p = make_pair("hello", 1);)
Unordered Containers (UCs)
Example: unordered_set, unordered_multiset, unordered_map, unordered_multimap
UCs use == to compare elements instead of < to order them
- Types in unordered containers must be equality comparable
- When you write your own structs, overload
==as well as<
UCs store elements in indexed buckets instead of in a tree
- Useful for types that don’t have an obvious ordering relation over their values
UCs use hash functions to put and find elements in buckets
- May improve performance in some cases (if performance profiling suggests so)
- Declare UCs with pluggable hash functions via callable objects, decltype, etc.
- Or specialize the
std::hash()template for your type, used by default
Summary
Use associative containers for key based lookup
- Ordering of elements is maintained over the keys
- Think ranges and ordering rather than position indexes
- A sorted vector may be a better alternative (depends on which operations you will use most often, and their costs)
Ordered associative containers use strict weak order
- Operator
<or any callable object that acts like<overintcan be used
Maps allow two-level (dictionary-like) lookup
- Vs. sets which are used for “there or not there” lookup
- Map uses a
pairto associate key with mapped type
Can enforce uniqueness or allow duplicates
- Duplicates are still stored in order, creating “equal ranges”
IO Libraries
std::copy()
std::copy()
http://www.cplusplus.com/reference/algorithm/copy/
Takes 3 parameters:
copy(InputIterator first, InputIterator last, OutputIterator result);[first, last)specifies the range of elements to copy.resultspecifies where we are copying to.
Example:
vector<int> v = {1, 2, 3, 4, 5};
// copy v to cout
std::copy(v.begin(), v.end(), std::ostream_iterator<int>(std::cout, " "));Some useful destination iterator types:
ostream_iterator- iterator over an output stream (likecout)insert_iterator- inserts elements directly into an STL container (will be practiced in studio)
#include <iostream>
#include <string>
#include <fstream>
#include <iterator>
#include <algorithm>
using namespace std;
int main(int argc, char *argv[]) {
if (argc != 3) {
cerr << "Usage: " << argv[0] << " <input file> <output file>" << endl;
return 1;
}
string input_file = argv[1];
string output_file = argv[2];
ifstream input_file(input_file.c_str());
ofstream output_file(output_file.c_str());
// don't skip whitespace
input_file >> noskipws;
istream_iterator<char> i (input_file);
ostream_iterator<char> o (output_file);
// copy the input file to the output file: copy(InputIterator first, InputIterator last, OutputIterator result);
copy(i, istream_iterator<char>(), o);
cout << "Copied input file" << input_file << " to " << output_file << endl;
return 0;
}IO reviews
How to move data into and out of a program:
- Using
argcandargvto pass command line args - Using
coutto print data out to the terminal - Using
cinto obtain data from the user at run-time - Using an
ifstreamto read data in from a file - Using an
ofstreamto write data out to a file
How to move data between strings, basic types
- Using an
istringstreamto extract formatted int values - Using an
ostringstreamto assemble a string
Streams
Simply a buffer of data (array of bytes).
Insertion operator (<<) specifies how to move data from a variable into an output stream
Extraction operator (>>) specifies how to pull data off of an input stream and store it into a variable
Both operators defined for built-in types:
- Numeric types
- Pointers
- Pointers to char (char *)
Cannot copy or assign stream objects
- Copy construction or assignment syntax using them results in a compile-time error
Extraction operator consumes data from input stream
- “Destructive read” that reads a different element each time
- Use a variable if you want to read same value repeatedly
Need to test streams’ condition states
- E.g., calling the
is_openmethod on a file stream - E.g., use the stream object in a while or if test
- Insertion and extraction operators return a reference to a stream object, so can test them too
File stream destructor calls close automatically
Flushing and stream manipulators
An output stream may hold onto data for a while, internally
- E.g., writing chunks of text rather than a character at a time is efficient
- When it writes data out (e.g., to a file, the terminal window, etc.) is entirely up to the stream, unless you tell it to flush out its buffers
- If a program crashes, any un-flushed stream data is lost
- So, flushing streams reasonably often is an excellent debugging trick
Can tie an input stream directly to an output stream
- Output stream is then flushed by call to input stream extraction operator
- E.g.,
my_istream.tie(&my_ostream); coutis already tied tocin(useful for prompting the user, getting input)
Also can flush streams directly using stream manipulators
- E.g.,
cout << flush;orcout << endl;orcout << unitbuf;
Other stream manipulators are useful for formatting streams
- Field layout:
setwidth,setprecision, etc. - Display notation:
oct,hex,dec,boolalpha,nobooleanalpha,scientific, etc.