blog/pb071/pexam/b-garbage_collect.md
Matej Focko f801e4bb16
chore: add ‹last_update› to docs-like posts
Allow showing ‹last updated› on the pages and also rename the files
where necessary, so they are shown in the order they were created.

Signed-off-by: Matej Focko <mfocko@redhat.com>
2023-08-15 17:27:16 +02:00

169 lines
No EOL
4.9 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
slug: garbage_collect
title: Practice exam B
description: |
Garbage everywhere…
last_update:
date: 2023-05-08
---
# Garbage Collection
:::caution Exam environment
* During the exam you will be provided with a barebone _exam session_ on the
_faculty computers_.
* In browser you are only allowed to have the following tabs open:
* [C documentation](https://en.cppreference.com)
* page containing the assignment
* You **are not** allowed to use your own source code, e.g. prepared beforehand
or from the seminars.
* You have **5 minutes** to read through the assignment and ask any follow-up
questions should be there something unclear.
* You have **60 minutes** to work on the assignment, afterward your work will be
discussed with your seminar tutor.
:::
You have gotten into a trouble during your regular upgrade of your archLinux[^1]
installation… You've been carelessly running the upgrades for months and forgot
about clearing up the caches.
Your task is to write a program `garbage_collect` that will evaluate the shell
history provided as a file and will try to find files or directories that are
suspiciously big and decide which of them should be deleted to free some space.
## Format of the shell history
You are provided one file consisting of the captured buffer of the terminal. You
can see only two commands being used:
1. `cd somewhere` that changes the current working directory.
At the beginning you start in the root of the filesystem (i.e. `/`).
You are **guaranteed** that `somewhere` is:
* `.` that is a current working directory (i.e. does nothing),
* `..` that moves you up one level (in case you are in `/`, does nothing), or
* is a valid directory in the current working directory.
:::caution
There are no guarantees or restrictions on the names of the files or
directories!
:::
1. `ls` that will list files in the current working directory and their
respective sizes. If there is a directory in the current working it has `dir`
instead of the size.
```
$ ls
dir a
14848514 b.txt
8504156 c.dat
dir d
$ cd a
$ cd .
$ cd .
$ cd .
$ ls
dir e
29116 f
2557 g
62596 h.lst
$ cd e
$ ls
584 i
$ cd ..
$ cd ..
$ cd d
$ ls
4060174 j
8033020 d.log
5626152 d.ext
7214296 k
```
For this input, you will get following file system:
```
- / (dir, size=48381165)
- a (dir, size=94853)
- e (dir, size=584)
- i (file, size=584)
- f (file, size=29116)
- g (file, size=2557)
- h.lst (file, size=62596)
- b.txt (file, size=14848514)
- c.dat (file, size=8504156)
- d (dir, size=24933642)
- j (file, size=4060174)
- d.log (file, size=8033020)
- d.ext (file, size=5626152)
- k (file, size=7214296)
```
## Format of the output
Your program should support 2 switches:
* `-gt min_size` that will print out suspiciously big files.
* `-f total_size min_unused` that will print out a file to be deleted.
### `-gt min_size`
With this switch you are provided one additional argument:
* `min_size` that is the lower bound (inclusive) for size of any file or
directory that is supposed to be listed.
When your program is being run with this switch, it is is supposed to print out
all files **and** directories that are bigger than the provided `min_size`.
### `-f total_size min_unused`
With this switch you are provided two additional arguments:
* `total_size` that is a total size of the filesystem[^2].
* `min_unused` that is a minimum of free space required for an upgrade.
Your program should find **exactly one** file or a directory that is of the
smallest size, but big enough to free enough space for the upgrade to proceed.
In other words, if that file or directory is deleted, following should hold:
$$
\mathtt{total\_size} - \mathtt{used} \geq \mathtt{min\_unused}
$$
## Example usage
You can have a look at the example usage of your program. We can run your
program from the shell like
$ ./garbage_collect shell_history.txt -gt 10000000
24933642 /d
14848514 /b.txt
48381165 /
$ ./garbage_collect shell_history.txt -f 70000000 30000000
24933642 /d
## Requirements and notes
* Define **structures** (and **enumerations**, if applicable) for the parsed
information from the files.
* For keeping the “records”, use some **dynamic** data structure.
* Don't forget to consider pros and cons of using _specific_ data structures
before going through implementing.
* You **are not required** to produce 1:1 output to the provided examples, they
are just a hint to not waste your time tinkering with a user experience.
* If any of the operations on the input files should fail,
**you are expected to** handle the situation _accordingly_.
* Failures of any other common functions (e.g. functions used for memory
management) should be handled in **the same way** as they were in the
homeworks and seminars.
* Your program **must free** all the resources before exiting.
[^1]: Also applies to Fedora, but… we use arch btw :wink:
[^2]: duh!