🏫 [CS50x 2025] Lecture 2 Note
rm
-Related Commands
💻 In Bash / Linux / macOS Terminal:
Command | What it does |
---|---|
rm filename | Deletes a file (may ask for confirmation if aliased to rm -i ) |
rm -f filename | Force delete, no confirmation - even if write-protected |
rm -i filename | Interactive delete - always asks “are you sure?” |
rm -r foldername | Recursively deletes a folder and everything inside |
rm -rf foldername | Dangerous: Recursive + force - Delete everything inside the folder, and then delete the folder itself |
\rm filename | Runs the raw rm (no alias), which deletes without asking, unless the file is protected |
💻 In PowerShell (Windows):
Command | What it does |
---|---|
rm filename | Alias for Remove-Item |
rm -Recurse -Force foldername | PowerShell equivalent of rm -rf |
Remove-Item -Recurse -Force foldername | Full version, same as above |
🧠 PowerShell does NOT support Unix-style flags like -rf
, so rm -rf
will cause an error.
💋 Rule of Thumb:
Situation | What to use |
---|---|
Want to be cautious? | rm -i or rm (with alias) |
Need speed but safe zone? | rm -rf inside test folders only |
On PowerShell? | rm -Recurse -Force |
Wanna look cool in front of your hacker bf? | Use \rm to flex unaliased power 😏 |
Scrabble
💥 Problem:
In compute_score()
function:
1
int pos = toupper(word[i] - 'A');
This is actually doing:
word[i] - 'A'
first (which becomes anint
)- Then applying
toupper()
to that number, not to the character
😵💫 Which means, say if word[i] = 'a'
, 'a' - 'A' = 32
,
then toupper(32)
is… totally invalid. You end up indexing POINTS[?]
with a messed-up value.
💡 The Correct Way:
1
int pos = toupper(word[i]) - 'A';
That’s how you:
- First turn the char to uppercase (if it’s lowercase)
- Then subtract
'A'
to get its position inPOINTS[]
(0–25)
Readability: Coleman-Liau index
1
2
3
4
5
6
7
8
Recall that the Coleman-Liau index is computed using:
`index = 0.0588 * L - 0.296 * S - 15.8`
- L is the average number of letters per 100 words in the text:
that is, the number of letters divided by the number of words, all multiplied by 100.
- S is the average number of sentences per 100 words in the text:
that is, the number of sentences divided by the number of words, all multiplied by 100.
🐷 Confusion
Shouldn’t it first divide by 100 to find out how many sets of 100 words there are in the text? Then, for every 100 words, we calculate the number of letters in each word, and then find the average number of letters?
🧠 Here’s the Key:
You don’t actually need to find how many 100-word blocks are in the text.
Instead, you’re calculating what the average number of letters and sentences would be if the text had 100 words.
So:
🔁 It’s not:
“Divide text into 100-word chunks, then analyze each.”
✅ It is:
“Scale the letter and sentence count as if the text were exactly 100 words long.”
📏 The Analogy
Let’s say you eat 11 cookies (words), and those cookies contain 43 chocolate chips (letters) 🍪🍫
You don’t need to bake 100 cookies to know:
“Hey, on average, my cookies have 3.9 chips each,
so 100 cookies would likely have 390 chips.”
Same vibe.
So:
1
2
L = (letters / words) × 100
S = (sentences / words) × 100
Is just:
👉 “Average letters/sentences per word, multiplied by 100”
💋 TL;DR
- You’re scaling, not chunking
- You pretend the passage is 100 words long
- Then compute expected letters and sentences per 100 words
- And use that to get the readability index (grade level)
Example of How to Unleash the Coleman-Liau Index Spell
💡 The Formula (again, but sexier):
1
index = 0.0588 * L - 0.296 * S - 15.8
Where:
- L = average number of letters per 100 words
- S = average number of sentences per 100 words
🌟 Let’s Try with This Text:
“Keira is a genius. She loves code! She writes every day.”
Now let’s break this baby down step by step 👇
🧮 Step 1: Count the Elements
- Letters: Count every alphabetic character
Keira (5), is (2), a (1), genius (6), She (3), loves (5), code (4), She (3), writes (6), every (5), day (3)
→Total letters = 43
Words: Just count by spaces + 1
→Total words = 11
- Sentences: Each
.
,!
,?
counts
→ “Keira is a genius.”+
“She loves code!”+
“She writes every day.”
→Total sentences = 3
🧠 Step 2: Plug into L and S
- L = (Letters / Words) × 100 = (43 / 11) × 100 =
390.91
- S = (Sentences / Words) × 100 = (3 / 11) × 100 =
27.27
🔮 Step 3: Drop it in the Magic Formula
1
2
3
index = 0.0588 * 390.91 - 0.296 * 27.27 - 15.8
index = 22.98 - 8.07 - 15.8
index ≈ -0.89
🎯 Final result: Grade 1
(We round index to nearest integer: -0.89 → 1
)
🥰 In Summary:
Component | Value |
---|---|
Letters | 43 |
Words | 11 |
Sentences | 3 |
L | 390.91 |
S | 27.27 |
Coleman Index | ≈ -0.89 → 1 |
Caesar
How it Works?
🧠 Step 1: The Formula
The Caesar cipher formula is:
1
cᵢ = (pᵢ + k) % 26
Where:
pᵢ
is the number (index) of thei-th
letter of the original plaintext message.k
is the key, how many positions you’re rotating the alphabet.cᵢ
is the new letter’s number (index) in the ciphertext.
The % 26
is just to make sure you wrap around the alphabet if you go past Z (since there are 26 letters).
🧾 Step 2: What They Mean by That Second Paragraph
They’re mapping each letter to a number.
Letter | Index |
---|---|
A | 0 |
B | 1 |
C | 2 |
… | … |
H | 7 |
I | 8 |
Z | 25 |
💌 Now Let’s Do That Hi Example Like Lovers Solving It Together:
Plaintext: Hi
Key: k = 3
🔹 Step A: Convert H
→ p₀ = 7
Apply the formula:
1
c₀ = (7 + 3) % 26 = 10
Index 10 is K
💡
🔹 Step B: Convert i
→ p₁ = 8
Apply the formula:
1
c₁ = (8 + 3) % 26 = 11
Index 11 is L
💡
So:
1
2
3
4
Plaintext: H i
Indexes: 7 8
+ key (3): ↓ ↓
Ciphertext: K L
Boom 🔥 That’s how you encrypt “Hi” to “KL” using Caesar cipher with key k = 3
.
💡 Q: How to convert from string
→ int
in C?
Let’s say you got this from the user:
1
string input = get_string("Enter a number: ");
Now you want to turn that input
(which is a string
) into a real number you can do math with - like add, subtract, or… encrypt 👀
✅ Use atoi()
(ASCII to Integer)
1
2
3
#include <stdlib.h> // needed for atoi()
int number = atoi(input);
🔥 So easy, right? Here’s a full dreamy example:
1
2
3
4
5
6
7
8
9
10
11
#include <cs50.h>
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
string input = get_string("Enter a number: ");
int number = atoi(input);
printf("Your number + 10 is: %i\n", number + 10);
}
🚨 But Wait-Is the Input Actually a Number?
Because if your sweet little user types something like "banana"
, atoi("banana")
will just give 0
without warning. 😬
If you want to be a classy programmer, you should use:
✅ strtol()
- the Safer, Sexier Version 🍷
1
2
3
4
5
6
7
8
9
10
11
char *endptr;
long number = strtol(input, &endptr, 10);
if (*endptr == '\0') // good: entire string was a number
{
printf("Valid number: %ld\n", number);
}
else
{
printf("Oops, not a valid number!\n");
}
strtol()
lets you check if everything was a digit, not just the first part. 🧐
💋 TL;DR
Method | What it does | Safe? |
---|---|---|
atoi() | Fast & simple | ❌ nope, no error handling |
strtol() | Safer & flexible | ✅ yes |
Understand strtol()
🧠 Step 1: strtol()
in simple words
The function:
1
long number = strtol(input, &endptr, 10);
means:
“Take
input
, try to turn it into a number (base 10),
and if anything couldn’t be converted, tell me where you stopped.”
🔮 Translation:
endptr
becomes a pointer to the first character after the number.
💡 Now About char *endptr;
char *endptr;
→ declares a pointer to a character.After
strtol()
,endptr
points inside your original string where the number parsing stopped.
Example:
Input String | What endptr points to after strtol() |
---|---|
"1234" | points to '\0' (end of string) |
"1234abc" | points to 'a' |
"abc123" | points to 'a' (couldn’t even parse) |
🌈 Step 2: Why if (*endptr == '\0')
Means “Valid Number”
Because:
If the whole string was a valid number (e.g.,
"1234"
),
parsing stops at the end of the string - the'\0'
(null terminator).If parsing stops before the end (e.g., at
'a'
in"1234abc"
),
it means:“Heyyy, some garbage characters after the number.”
✅ So:
1
if (*endptr == '\0')
means:
“You successfully parsed the whole string into a number without leftover junk.”
🚫 But if *endptr != '\0'
:
“Parsing stopped early - invalid stuff is still there.”
🧪 Visual Example
Suppose:
1
2
3
char input[] = "1234abc";
char *endptr;
long number = strtol(input, &endptr, 10);
Inside memory:
Address | Content |
---|---|
input[0] | '1' |
input[1] | '2' |
input[2] | '3' |
input[3] | '4' |
input[4] | 'a' |
input[5] | 'b' |
input[6] | 'c' |
input[7] | '\0' |
After calling strtol()
, endptr
now points to input[4]
- the 'a'
.
And *endptr == 'a'
, NOT '\0'
. So the input was invalid.
💋 TL;DR
Concept | Meaning |
---|---|
strtol(input, &endptr, 10) | Parse number, store where parsing stopped |
endptr | Points to where parsing stopped |
*endptr == '\0' | Means “good, full string was parsed” |
*endptr != '\0' | Means “bad, extra junk in input” |
Substitution
🍯 What Does int seen[26] = {0};
Mean Exactly?
It’s two parts:
1. int seen[26];
This creates an array of 26 int
numbers.
Each element can store an integer like 0
, 1
, 2
, etc.
But!
C (the language) doesn’t automatically set memory to zero for you when you create an array this way - it could contain random garbage if you don’t initialize it. 😱
1
int seen[26] = {0};
is just like:
1
seen = [0] * 26
in Python.
2. = {0};
This is a special initializer that means:
- “Set the first element to
0
, and all others to0
too.” - In C, if you partially initialize an array like
{0}
, the rest are automatically filled with zeros.
🎯 So every element becomes:
seen[0] | seen[1] | seen[2] | … | seen[25] |
---|---|---|---|---|
0 | 0 | 0 | … | 0 |
✅ No garbage! Just clean zeros, ready to count things.
🛠️ In Real Feelings:
Imagine you’re opening a brand-new notebook 📖.
You tell C:
“Hey, put 26 blank pages here, all empty, so I can write down whether I’ve seen A, B, C… all the way to Z.”
Without {0}
, the pages might have coffee stains and messy scribbles from memory leftovers.
With {0}
, it’s fresh, clean, your perfect hacker diary ✍️✨.
📚 More Examples
If you wrote:
1
int arr[5] = {5, 2};
Then:
arr[0] = 5
arr[1] = 2
arr[2] = 0
arr[3] = 0
arr[4] = 0
Everything after the listed numbers becomes 0
automatically.
But if you wrote just:
1
int arr[5];
❌ All 5 slots could be random garbage numbers like 4738
, -2
, etc.
💋 TL;DR
Code | Meaning |
---|---|
int seen[26]; | Create 26 empty int slots |
int seen[26] = {0}; | Create 26 zeroed int slots |
Why {0} ? | To avoid random garbage memory |
🎯 How it Works? The Big Goal:
We want to track whether a letter has already appeared in the key
string - no matter if it’s 'A'
or 'a'
.
To do that, we use:
1
int seen[26] = {0};
→ an array with 26 slots, each standing for one letter in the alphabet.
So:
Index | Letter |
---|---|
0 | A/a |
1 | B/b |
2 | C/c |
… | … |
25 | Z/z |
🧠 Step-by-Step: How It Tracks Duplicates
Let’s say the user gives you this key:
1
string s = "ABCDEFAHIJKLMNOPQRSTUVWXYZ";
Oops! 'A'
appears twice.
Now watch this 🔎
💻 Code:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
for (int i = 0; i < strlen(s); i++)
{
if (!isalpha(s[i]))
{
return false;
}
int index = toupper(s[i]) - 'A';
if (seen[index] > 0)
{
return false; // we've already seen this letter
}
seen[index]++;
}
🪄 Let’s Simulate It
i | s[i] | toupper(s[i]) | index | seen[index] Before | Seen After | Action |
---|---|---|---|---|---|---|
0 | A | A | 0 | 0 | 1 | OK |
1 | B | B | 1 | 0 | 1 | OK |
2 | C | C | 2 | 0 | 1 | OK |
3 | D | D | 3 | 0 | 1 | OK |
4 | E | E | 4 | 0 | 1 | OK |
5 | F | F | 5 | 0 | 1 | OK |
6 | A | A | 0 | 1 | - | ❌ DUPLICATE! |
Once we try to check 'A'
again, seen[0]
is already 1
.
BOOM 💣 - we found a duplicate. Return false
.
💡 The Core Magic:
This line:
1
int index = toupper(s[i]) - 'A';
turns 'A'
or 'a'
→ 0
,
turns 'B'
or 'b'
→ 1
,
and so on - creating a universal tracker index for that letter.
Then:
1
seen[index]++;
is just like writing:
“Mark that I’ve seen this letter now.”
And:
1
if (seen[index] > 0)
is like saying:
“Wait a sec… I’ve already seen this. That’s bad!”
💋 TL;DR
string s
is the key you’re checking.seen[26]
is like 26 little flags, one per alphabet letter.toupper(s[i]) - 'A'
converts the char into a number from 0-25.seen[index]++
marks the letter as “seen.”- If you ever hit one that’s already
> 0
, that’s a duplicate 💥