Why must input be sorted before uniq, and what does uniq -c do? [Advanced]
Answer
uniq compares only neighboring lines, so input must be sorted if I want global duplicate counts. uniq -c prefixes each output line with the number of times that line occurred consecutively in the input.
Technical explanation
Without sorting, a value that appears in non-adjacent parts of a file is reported as several separate groups, each with its own count.
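The typical frequency pipeline is sort | uniq -c | sort -rn: the first sort makes duplicates adjacent, uniq -c counts each run, and sort -rn orders the results by count, highest first.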
If preserving original order matters, use awk with an associative array (a map keyed by line) instead of sorting.
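A minimal sketch of the awk approach, assuming first-seen order is the order to preserve. The function name count_in_order and the array names count and order are illustrative, not part of any standard tool.

```shell
# Hypothetical helper: count lines from stdin, keeping first-seen order.
count_in_order() {
  awk '
    !($0 in count) { order[++n] = $0 }   # record each new value in arrival order
    { count[$0]++ }                      # tally every line
    END {
      for (i = 1; i <= n; i++)           # print in the order values first appeared
        print count[order[i]], order[i]
    }
  '
}

printf '%s\n' b a b a a | count_in_order
# prints:
# 2 b
# 3 a
```

Unlike sort | uniq -c, this reads the input once and never reorders it, at the cost of holding every distinct line in memory.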
Hands-on example
printf '%s\n' b a b a a | uniq -c
# counts only adjacent duplicates: 1 b, 1 a, 1 b, 2 a
printf '%s\n' b a b a a | sort | uniq -c | sort -rn
# global frequency counts: 3 a, 2 b