2023-05-08 00:14:08 +02:00
|
|
|
|
---
|
|
|
|
|
slug: garbage_collect
|
|
|
|
|
title: Practice exam B
|
|
|
|
|
description: |
|
|
|
|
|
Garbage everywhere…
|
2023-08-15 17:27:16 +02:00
|
|
|
|
last_update:
|
|
|
|
|
date: 2023-05-08
|
2023-05-08 00:14:08 +02:00
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
# Garbage Collection
|
|
|
|
|
|
2024-07-11 23:50:54 +02:00
|
|
|
|
:::warning Exam environment
|
2023-05-08 00:14:08 +02:00
|
|
|
|
|
2023-11-24 16:00:45 +01:00
|
|
|
|
- During the exam you will be provided with a barebone _exam session_ on the
|
2023-05-08 00:14:08 +02:00
|
|
|
|
_faculty computers_.
|
2023-11-24 16:00:45 +01:00
|
|
|
|
- In browser you are only allowed to have the following tabs open:
|
|
|
|
|
- [C documentation](https://en.cppreference.com)
|
|
|
|
|
- page containing the assignment
|
|
|
|
|
- You **are not** allowed to use your own source code, e.g. prepared beforehand
|
2023-05-08 00:14:08 +02:00
|
|
|
|
or from the seminars.
|
2023-11-24 16:00:45 +01:00
|
|
|
|
- You have **5 minutes** to read through the assignment and ask any follow-up
|
2023-05-08 00:14:08 +02:00
|
|
|
|
questions should be there something unclear.
|
2023-11-24 16:00:45 +01:00
|
|
|
|
- You have **60 minutes** to work on the assignment, afterward your work will be
|
2023-05-08 00:14:08 +02:00
|
|
|
|
discussed with your seminar tutor.
|
|
|
|
|
|
|
|
|
|
:::
|
|
|
|
|
|
|
|
|
|
You have gotten into a trouble during your regular upgrade of your archLinux[^1]
|
|
|
|
|
installation… You've been carelessly running the upgrades for months and forgot
|
|
|
|
|
about clearing up the caches.
|
|
|
|
|
|
|
|
|
|
Your task is to write a program `garbage_collect` that will evaluate the shell
|
|
|
|
|
history provided as a file and will try to find files or directories that are
|
|
|
|
|
suspiciously big and decide which of them should be deleted to free some space.
|
|
|
|
|
|
|
|
|
|
## Format of the shell history
|
|
|
|
|
|
|
|
|
|
You are provided one file consisting of the captured buffer of the terminal. You
|
|
|
|
|
can see only two commands being used:
|
|
|
|
|
|
|
|
|
|
1. `cd ‹somewhere›` that changes the current working directory.
|
|
|
|
|
|
|
|
|
|
At the beginning you start in the root of the filesystem (i.e. `/`).
|
2023-11-24 16:00:45 +01:00
|
|
|
|
|
2023-05-08 00:14:08 +02:00
|
|
|
|
You are **guaranteed** that `‹somewhere›` is:
|
2023-11-24 16:00:45 +01:00
|
|
|
|
|
|
|
|
|
- `.` that is a current working directory (i.e. does nothing),
|
|
|
|
|
- `..` that moves you up one level (in case you are in `/`, does nothing), or
|
|
|
|
|
- is a valid directory in the current working directory.
|
2023-05-08 00:14:08 +02:00
|
|
|
|
|
2024-07-11 23:50:54 +02:00
|
|
|
|
:::warning[caution]
|
2023-05-08 00:14:08 +02:00
|
|
|
|
|
|
|
|
|
There are no guarantees or restrictions on the names of the files or
|
|
|
|
|
directories!
|
|
|
|
|
|
|
|
|
|
:::
|
|
|
|
|
|
|
|
|
|
1. `ls` that will list files in the current working directory and their
|
|
|
|
|
respective sizes. If there is a directory in the current working it has `dir`
|
|
|
|
|
instead of the size.
|
|
|
|
|
|
|
|
|
|
```
|
|
|
|
|
$ ls
|
|
|
|
|
dir a
|
|
|
|
|
14848514 b.txt
|
|
|
|
|
8504156 c.dat
|
|
|
|
|
dir d
|
|
|
|
|
$ cd a
|
|
|
|
|
$ cd .
|
|
|
|
|
$ cd .
|
|
|
|
|
$ cd .
|
|
|
|
|
$ ls
|
|
|
|
|
dir e
|
|
|
|
|
29116 f
|
|
|
|
|
2557 g
|
|
|
|
|
62596 h.lst
|
|
|
|
|
$ cd e
|
|
|
|
|
$ ls
|
|
|
|
|
584 i
|
|
|
|
|
$ cd ..
|
|
|
|
|
$ cd ..
|
|
|
|
|
$ cd d
|
|
|
|
|
$ ls
|
|
|
|
|
4060174 j
|
|
|
|
|
8033020 d.log
|
|
|
|
|
5626152 d.ext
|
|
|
|
|
7214296 k
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
For this input, you will get following file system:
|
2023-11-24 16:00:45 +01:00
|
|
|
|
|
2023-05-08 00:14:08 +02:00
|
|
|
|
```
|
|
|
|
|
- / (dir, size=48381165)
|
|
|
|
|
- a (dir, size=94853)
|
|
|
|
|
- e (dir, size=584)
|
|
|
|
|
- i (file, size=584)
|
|
|
|
|
- f (file, size=29116)
|
|
|
|
|
- g (file, size=2557)
|
|
|
|
|
- h.lst (file, size=62596)
|
|
|
|
|
- b.txt (file, size=14848514)
|
|
|
|
|
- c.dat (file, size=8504156)
|
|
|
|
|
- d (dir, size=24933642)
|
|
|
|
|
- j (file, size=4060174)
|
|
|
|
|
- d.log (file, size=8033020)
|
|
|
|
|
- d.ext (file, size=5626152)
|
|
|
|
|
- k (file, size=7214296)
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
## Format of the output
|
|
|
|
|
|
|
|
|
|
Your program should support 2 switches:
|
|
|
|
|
|
2023-11-24 16:00:45 +01:00
|
|
|
|
- `-gt ‹min_size›` that will print out suspiciously big files.
|
|
|
|
|
- `-f ‹total_size› ‹min_unused›` that will print out a file to be deleted.
|
2023-05-08 00:14:08 +02:00
|
|
|
|
|
|
|
|
|
### `-gt ‹min_size›`
|
|
|
|
|
|
|
|
|
|
With this switch you are provided one additional argument:
|
|
|
|
|
|
2023-11-24 16:00:45 +01:00
|
|
|
|
- `min_size` that is the lower bound (inclusive) for size of any file or
|
2023-05-08 00:14:08 +02:00
|
|
|
|
directory that is supposed to be listed.
|
|
|
|
|
|
|
|
|
|
When your program is being run with this switch, it is is supposed to print out
|
|
|
|
|
all files **and** directories that are bigger than the provided `min_size`.
|
|
|
|
|
|
|
|
|
|
### `-f ‹total_size› ‹min_unused›`
|
|
|
|
|
|
|
|
|
|
With this switch you are provided two additional arguments:
|
|
|
|
|
|
2023-11-24 16:00:45 +01:00
|
|
|
|
- `total_size` that is a total size of the filesystem[^2].
|
|
|
|
|
- `min_unused` that is a minimum of free space required for an upgrade.
|
2023-05-08 00:14:08 +02:00
|
|
|
|
|
|
|
|
|
Your program should find **exactly one** file or a directory that is of the
|
|
|
|
|
smallest size, but big enough to free enough space for the upgrade to proceed.
|
|
|
|
|
|
|
|
|
|
In other words, if that file or directory is deleted, following should hold:
|
2023-11-24 16:00:45 +01:00
|
|
|
|
|
2023-05-08 00:14:08 +02:00
|
|
|
|
$$
|
|
|
|
|
\mathtt{total\_size} - \mathtt{used} \geq \mathtt{min\_unused}
|
|
|
|
|
$$
|
|
|
|
|
|
|
|
|
|
## Example usage
|
|
|
|
|
|
|
|
|
|
You can have a look at the example usage of your program. We can run your
|
|
|
|
|
program from the shell like
|
|
|
|
|
|
2024-07-11 23:50:54 +02:00
|
|
|
|
```
|
|
|
|
|
$ ./garbage_collect shell_history.txt -gt 10000000
|
|
|
|
|
24933642 /d
|
|
|
|
|
14848514 /b.txt
|
|
|
|
|
48381165 /
|
2023-05-08 00:14:08 +02:00
|
|
|
|
|
2024-07-11 23:50:54 +02:00
|
|
|
|
$ ./garbage_collect shell_history.txt -f 70000000 30000000
|
|
|
|
|
24933642 /d
|
|
|
|
|
```
|
2023-05-08 00:14:08 +02:00
|
|
|
|
|
|
|
|
|
## Requirements and notes
|
|
|
|
|
|
2023-11-24 16:00:45 +01:00
|
|
|
|
- Define **structures** (and **enumerations**, if applicable) for the parsed
|
2023-05-08 00:14:08 +02:00
|
|
|
|
information from the files.
|
2023-11-24 16:00:45 +01:00
|
|
|
|
- For keeping the “records”, use some **dynamic** data structure.
|
|
|
|
|
- Don't forget to consider pros and cons of using _specific_ data structures
|
2023-05-08 00:14:08 +02:00
|
|
|
|
before going through implementing.
|
2023-11-24 16:00:45 +01:00
|
|
|
|
- You **are not required** to produce 1:1 output to the provided examples, they
|
2023-05-08 00:14:08 +02:00
|
|
|
|
are just a hint to not waste your time tinkering with a user experience.
|
2023-11-24 16:00:45 +01:00
|
|
|
|
- If any of the operations on the input files should fail,
|
2023-05-08 00:14:08 +02:00
|
|
|
|
**you are expected to** handle the situation _accordingly_.
|
2023-11-24 16:00:45 +01:00
|
|
|
|
- Failures of any other common functions (e.g. functions used for memory
|
2023-05-08 00:14:08 +02:00
|
|
|
|
management) should be handled in **the same way** as they were in the
|
|
|
|
|
homeworks and seminars.
|
2023-11-24 16:00:45 +01:00
|
|
|
|
- Your program **must free** all the resources before exiting.
|
2023-05-08 00:14:08 +02:00
|
|
|
|
|
|
|
|
|
[^1]: Also applies to Fedora, but… we use arch btw :wink:
|
2023-11-24 16:00:45 +01:00
|
|
|
|
[^2]: duh!
|