As part of my self-study in scripting and Linux, I worked on a challenge designed to test my Bash skills: the prompt asked me to create a script that could scan a directory, identify sensitive or large files, and archive the results with clear logging and output.

I didn’t come up with the idea, but I did write all the Bash code myself, structured the logic, tested the behavior, and refined it until it worked reliably.

(Screenshot: the script in action.)


What the Script Does

The script (audit_backup.sh) combines a few core tasks:

1. Find the Largest Files

find "$target_dir" -type f -exec du -h {} + | sort -rh | head -5 > logs/largest_files.txt

This scans the target directory, measures every file's size, sorts the results by human-readable size in descending order, and logs the five largest files to logs/largest_files.txt.
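
For context, the snippets in this section assume a short preamble that sets target_dir and timestamp and creates the output folders. The variable names come from the snippets themselves, but this exact setup is my reconstruction rather than a verbatim excerpt from audit_backup.sh:

#!/usr/bin/env bash

# Target directory comes from the first argument; default to the current directory.
target_dir="${1:-.}"

# Timestamp used in the archive filename (the exact format here is my guess).
timestamp="$(date +%Y%m%d-%H%M%S)"

# Create the folders the snippets redirect into before anything writes to them.
mkdir -p logs archive

With that in place, the script is invoked as, for example, ./audit_backup.sh /var/log (a hypothetical target).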

2. Identify Potentially Sensitive Files

find "$target_dir" \( -size +1M -o -name "*.log" -o -name "*.key" \) | grep -vi 'debug' > logs/findings.txt

This flags files over 1MB or with .log / .key extensions, then excludes any paths containing “debug” (case-insensitive) to keep noisy debug logs out of the results. The filtered list is saved to logs/findings.txt.
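
As a design note, the same exclusion can be pushed into find itself: GNU find's -ipath matches the whole path case-insensitively, which drops the extra grep process. This is a variation of mine, not the script's original approach:

# Variation: let GNU find handle the case-insensitive "debug" exclusion itself.
find "$target_dir" -type f \( -size +1M -o -name "*.log" -o -name "*.key" \) \
    ! -ipath '*debug*' > logs/findings.txt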

3. Create a Clean Archive

# Default (preserves full path)
tar --exclude='*.bin' -czf "archive/$(basename "$target_dir")-$timestamp.tar.gz" "$target_dir"

# Alternative (flatter archive, no full path)  
# tar --exclude='*.bin' -czf "archive/$(basename "$target_dir")-$timestamp.tar.gz" \
#   -C "$(dirname "$target_dir")" "$(basename "$target_dir")"

This compresses the directory into a timestamped archive while excluding .bin files. A commented-out alternative is also included: it uses tar's -C flag to change into the parent directory first, so the archive stores only the top-level folder name rather than the full path.
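
Either way, a quick way to confirm which form the stored paths took is to list the archive's contents without extracting it:

# List the first few entries to check whether the full path was preserved.
tar -tzf "archive/$(basename "$target_dir")-$timestamp.tar.gz" | head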

(Screenshot: the script running with the alternative tar command on a different directory.)

4. Log Disk Usage Summary

{
    echo "Disk usage of target folder:"
    du -sh "$target_dir"
    echo
    echo "Available disk space on system:"
    df -h /
} > logs/summary.txt

This logs both the folder’s total size and the system’s available disk space, giving quick context for storage health.
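
One caveat: df -h / reports the root filesystem, which is accurate only if the target folder lives on it. Passing the directory itself makes df report whichever filesystem actually holds it; this tweak is my suggestion rather than part of the original script:

# Report free space on the filesystem that actually contains the target.
df -h "$target_dir"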


Purpose of the Exercise

This project was about applying the Bash skills I had been learning:

  • Writing modular, readable scripts
  • Using core commands effectively (find, du, grep, tar, df)
  • Building automation with proper logging
  • Thinking practically about how simple scripts can help with audits and backups

What I Learned

  • How to combine arguments, variables, and pipes to make Bash scripts more powerful.
  • The importance of filtering output with tools like grep -v to ignore noise (such as DEBUG files).
  • Why including options (like the two tar methods) makes scripts more flexible.
  • That even small automation scripts can make system administration tasks faster and more reliable.

Final Thoughts

Through this project I sharpened my skills with arguments, variables, and pipes, all essentials in automation, and strengthened both my confidence and competence in Bash.