CS144 Lab 0
Lab Checkpoint 0: networking warmup
Expected hours to complete: 2-6 hours
Over the course of this 8-part lab assignment, you’ll be building up your own implementation of a significant portion of the Internet—a router, a network interface, and the TCP protocol.
在Lab0中,我们需要实现一个使用TCP套接字获取网页的简易Web客户端和一个内存中的可靠字节流(ByteStream)。
FAQs
- What should the behavior of my program be if the caller tries to pop with a
len
greater than what is available?
We don't have any preference on the behavior if the caller tries to pop more than is buffered in the stream, as long as you behave reasonably and don't crash in that situation. If you want to pop the maximum available, that is fine with us. If you want to throw an exception, that is also fine with us.
- How fast should my code be?
Here's a reasonable rule: if any of your tests are taking longer than 2 seconds, it's probably good to keep exploring other ways of representing the ByteStream
. And if your benchmark results at the end of the test are worse than 0.1 Gbit/s (100 million bits per second), that's another sign the design may need to be rethought. This lab doesn't require any sophisticated data structures, but it might still take some trial-and-error to find a reasonable approach. Sometimes "big-O" analysis can be misleading about the performance of real programs—the constant factors can matter a lot!
Set up GNU/Linux on your computer
Install the CS144 VirtualBox virtual-machine image (instructions at https://stanford.edu/class/cs144/vm_howto/vm-howto-image.html).
需要注意的是,VirtualBox需要直接在主机操作系统上运行,比如如果你是Windows主机,那么就需要在Windows系统中安装和运行VirtualBox(不能在WSL这样的虚拟环境中运行)
Networking by hand
Fetch a Web Page
- Fetch a web page http://cs144.keithw.org/hello
1 |
|

- Fetch a web page http://cs144.keithw.org/lab0/sunetid
Send yourself an email
这部分只有连接Stanford内部网络才能完成
Listening and connecting
In this part, we play around with server and client.
Writing a network program using an OS stream socket
You’ll write a program called “webget” that creates a TCP stream socket, connects to a Web server, and fetches a page.
Set up
注意在git clone
之前可能需要先在虚拟机中配置Git代理(使用主机的VPN):
1 |
|
Modern C++
-
Use the language documentation at https://en.cppreference.com as a resource. (We’d recommend you avoid cplusplus.com which is more likely to be out-of-date.)
-
Never use malloc() or free().
-
Never use new or delete.
-
Essentially never use raw pointers (*), and use “smart” pointers (unique ptr or shared ptr) only when necessary. (You will not need to use these in CS144.)
-
Avoid templates, threads, locks, and virtual functions. (You will not need to use these in CS144.)
-
Avoid C-style
strings (char *str)
or string functions (strlen()
,strcpy()
). These are pretty error-prone. Use astd::string
instead. -
Never use C-style casts (e.g.,
(FILE *)x
). Use a C++static_cast
if you have to (you generally will not need this in CS144) -
Prefer passing function arguments by const reference (e.g.:
const Address & address
) -
Make every variable const unless it needs to be mutated.
-
Make every method const unless it needs to mutate the object.
-
Avoid global variables, and give every variable the smallest scope possible.
-
For suggestions on how to improve the code related to C++ programming practices:
1
cmake --build build --target tidy
To format the code:
1
cmake --build build --target format
Writing webget
Before writing
webget
, read over the public interfaces (the part that comes after “public:” in the filesutil/socket.hh
andutil/file descriptor.hh
).
In this part, we need to implement a simple Web client that fetches Web pages using a TCP socket provided by the operating system.
1 |
|
An in-memory reliable byte stream
In this part, we need to implement a ByteStream
object that provides reliable byte stream in memory. The Writer writes data into the stream and the Reader reads data from the stream.
What is the design of my
ByteStream
implementatino? What data structures and approach are taken?
设计这个ByteStream
的关键在于选择用什么data structure来表示内存中存储数据的stream。这个data structure需要方便从一端写入,从另一端读出,以及peek其中存储的数据。
我第一个想到的是用queue,但是queue只能读取队列首尾的元素,不方便peek。
因此可以用vector<char>
或者string,我这里选择用最直观的string。
1 |
|
这样的话Writer的push很简单,直接用string的append函数:
1 |
|
那Reader的pop如何实现呢?
最容易想到的是每次pop len
bytes的时候都将buffer_
的前len
个字符截掉:
1 |
|
但是这样的话每次pop buffer_
的内存都会重新分配,不符合pop 操作是 O(1) 的时间复杂度要求。
因此我想到用一个read_index_
来追踪当前buffer_
中读到的位置,这样每次pop只需要read_index_ += len
,只在buffer_
中已读过的bytes过多(超过buffer_
size的一半)时再重新整理:
1 |
|
这样的话buffer_
中read_index
之前的bytes是已经读过out of date的bytes,read_index
之后的bytes才是被push进stream中等待被读取的bytes。
Finish!
All tests in check0 pass!