DevTools to .ts: My Full Walkthrough of Building M3U8-Probe
build a dual-script video extractor from raw .m3u8 or direct .ts links, with async download, DevTools tactics, and ts-to-mp4 merging. Plus: guide for using link_dl.py & m3u8_dl.py.
Well, weβre all big fans of free comedies and showsβ¦ so why not grab a bucket of popcorn and let Python do the dirty work for us?
This is a detailed guide for M3U8-Probe. Hope this helps! And have fun! π
Heads-up: These are script-based tools for now - but Iβm planning to wrap them into CLI tools in the future, once they prove rock solid.
π§ͺ Tested Sites (So far)
These scripts have successfully worked on the following real-world sites(all for phree shows!). Feel free to test more.
Know more sites that work? PRs or issues welcome!
https://streamwatch.online/media/tmdb-movie-574475-Final%20Destination%20Bloodlines
https://ww4.fmovies.co/film/phineas-and-ferb-season-5-1630859174/
https://moviexfilm.com/on-swift-horses-2/
https://www.bttwo.me/
https://xiaoyakankan.com/post/f9905e1327.html?vod=190_11408-9
P.S. Rumor has itβ¦ these scripts also work on some spicy sites. But donβt ask me how I know.
π§ͺ Which Script Should You Use?
π» Step 1: Can you find .m3u8
files in DevTools?
(e.g. https://play.com/20240701/1rIWCjMw/2000kb/hls/index.m3u8
)
- Yes β
m3u8_dl.py
- No β Go to Step 2
π Step 2: Can you find sequential segment links in DevTools?
Look at how the numbers increase in the URLs:
Segment Style | Example | Pattern | Script to Use |
---|---|---|---|
Increasing at the end | https://play.gotomymv.life/.../000.ts β .../145.ts | .../000.ts β .../001.ts β β¦ | link_dl.py |
Increasing in the middle | https://.../seg-1-v1-a1.ts?... β .../seg-177-v1-a1.ts?... | seg-1-... β seg-2-... β β¦ | link_dl.py |
If you spot this kind of numbering - anywhere in the link - and it increases chunk by chunk, thatβs your green light for link_dl.py
.
Yes β Use
link_dl.py
No β Sorry bruh, these scripts wonβt help here. Might need to write a custom one or inspect more.
π¬ m3u8_dl.py
- Extracting .m3u8
Files
π What m3u8_dl.py
Does
Async segment scanner + downloader + merger
m3u8_dl.py
is for.m3u8
playlist-based video downloads.
It works best when the playlist contains a clean list of video chunks.
π Step 1: Read lines
Reads your .m3u8
playlist line by line, sniffs out only the good stuff (a.k.a. segment filenames).
Needs a full URL to the .m3u8 file, so it can reconstruct (segmentsβ) full URLs like a smooth operator.
πΏ Step 2: Download & Merge
Each segment is downloaded asynchronously using aiohttp
. Then itβs auto-merged into a single .ts
file. Just play and chill.
π οΈ How to Find the .m3u8
Link
Open DevTools Hit
F12
while the video is loading. If no network activity shows, hitCtrl + R
to reload.Filter Requests Go to the Network tab and filter by
XHR
andMedia
(you can also addHTML
andJS
in case some sites are tricky).Find the
.m3u8
file Look for a file ending in.m3u8
- there could be various names, likeindex.m3u8
,playlist.m3u8
, or even something sneaky without βm3u8β in the name at all. Click it, and check the Response tab.
You wanna see a full m3u8 playlist like this:#EXTM3U #EXT-X-VERSION:3 #EXT-X-TARGETDURATION:4 #EXT-X-PLAYLIST-TYPE:VOD #EXT-X-MEDIA-SEQUENCE:0 #EXTINF:2.085, /20240701/1rIWCjMw/2000kb/hls/Bhr5jO2P.jpg #EXTINF:3.17, /20240701/1rIWCjMw/2000kb/hls/k5gKM4Iw.jpg ... #EXT-X-ENDLIST
Though these segments end with
.jpg
, theyβre actually.ts
video chunks in disguise.
You can expose their true identity using your good olβ friendcurl
.Pick one, manually append it to the base URL, and run:
1
curl -I 'https://play.modujx12.com/20240701/1rIWCjMw/2000kb/hls/Bhr5jO2P.jpg'
If the response looks like this (instead of returning image metadata), then youβve caught a
.ts
file wearing a.jpg
mask:Accept-Ranges: bytes Access-Control-Allow-Headers: X-Requested-With Access-Control-Allow-Methods: POST, GET, OPTIONS Access-Control-Allow-Origin: * Content-Disposition: attachment; filename="Bhr5jO2P.ts" Content-Length: 324676 Content-Type: application/octet-stream Date: Wed, 18 Jun 2025 06:34:19 GMT Etag: "66831471-4f444" Last-Modified: Mon, 01 Jul 2024 20:41:21 GMT ...
Some CDN streaming services do this to obfuscate or bypass firewalls/caching rules.
Right click that request, Copy Value -> Copy URL, that is your
remote_m3u8_url
.
π Example .m3u8
file
Hereβs an example .m3u8
file.
Toward the end, youβll see lines like:
#EXT-X-DISCONTINUITY
#EXT-X-KEY:METHOD=NONE
#EXTINF:3.333,
/20250612/AOraqYHt/8984kb/hls/2LZ5mvIk.jpg
...
#EXT-X-ENDLIST
Now compare that with more typical chunks like:
#EXTINF:2.085,
/20240701/1rIWCjMw/2000kb/hls/YTXeysd2.jpg
#EXTINF:0.542,
/20240701/1rIWCjMw/2000kb/hls/6msgPZUM.jpg
If you try downloading from the later chunk (/20250612/AOraqYHt/8984kb/hls/0gOMoF2r.jpg
), youβll often findβ¦ surprise! Itβs an ad.
β οΈ Watch for #EXT-X-DISCONTINUITY
This tag means:
βHeads up! The next media chunk is from a different timeline or media source.β
If you see multiple #EXT-X-DISCONTINUITY
tags in a playlist, theyβre often marking:
- The boundary between ads and show
- Different chapters or transitions
- Totally different video parts spliced together
Thatβs why get_segs()
steps in-to filter out the trash and keep only the valid prefixes. Learn more in this part - Troubleshooting.
π§ͺ Run m3u8_dl.py
- Download Using .m3u8
Playlist
Once youβve got your remote_m3u8_url
ready, you can use this script to download the full video.
π How to Use m3u8_dl.py
Edit the parameters directly under the if __name__ == "__main__":
block in the script:
1
2
3
4
5
6
7
8
9
10
11
# Configuration section
remote_m3u8_url = "https://play.com/20240701/1rIWCjMw/2000kb/hls/index.m3u8"
output_file = "wrecked_s01e01.ts"
# Headers used for HTTP requests
headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36",
"Referer": "https://example.com",
"Origin": "https://example.com"
}
π§· Parameter Breakdown for m3u8_dl.py
Set these values in the script under if __name__ == "__main__":
before running.
remote_m3u8_url
- Remote full URL to the .m3u8 file (in DevTool).Example:
1
remote_m3u8_url = "https://play.com/20240701/1rIWCjMw/2000kb/hls/index.m3u8"
output_file
- Your Output FilenameMake it cute or clean, e.g.:
1
output_file = "wrecked_s01e10.ts"
headers
- Request Headers for DownloadingUsually, just tweak
Referer
andOrigin
based on what you saw in the browser.Examples:
1 2 3 4 5
headers = { "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36", "Referer": "https://www.bttwo.me", "Origin": "https://www.bttwo.me" }
Peek at the actual segment request in DevTool β
Headers
tab β scroll toRequest Headers
to copy & adapt.(Optional) Retry Times in
download_seg()
Sometimes the net gets cranky, or the server ghosts us mid-download.
So by default, each segment gets retried up to 3 times before giving up.
Just tweak this line in the function if you wanna change:
1
async def download_seg(session, sem, base_url, seg, idx, output_dir, max_retries=3): # π Change this "3" to any number you want
Run m3u8_dl.py
1
2
3
4
5
6
7
8
git clone https://github.com/kay-a11y/M3U8-Probe.git
cd M3U8-Probe
# π¦ Install dependencies
pip install -r requirements.txt
# π Run the script
python src/m3u8_dl.py
π Sample Output
2025-06-17 18:36:02,307 - INFO - base_url = https://play.modujx12.com/20240701/1rIWCjMw/2000kb/hls/
2025-06-17 18:36:08,707 - INFO - m3u8_file = /home/z3phyr/repos/GazeKit/hub-generalM3u8/data/wrecked_s01e01/playlist.m3u8
2025-06-17 18:36:08,710 - INFO - segs length = 623
π Downloading: 100%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 623/623 [00:36<00:00, 7.15seg/s]
βββ ...
βββ README.md
βββ wrecked_s01e10.ts ππ» Your video is here!
βββ data
β βββ wrecked_s01e10
β βββ index.m3u8 ππ» The playlist
βββ src
βββ link_dl.py
βββ m3u8_dl.py
π¬ link_dl.py
- Extracting Request URLs
Letβs get your hands dirty with raw segment sniffing and real-time scraping.
π οΈ Find the Request URLs
Open DevTools Hit
F12
while the video is loading. If no network activity shows, hitCtrl + R
to reload.Filter Requests Go to the Network tab and filter by
XHR
andMedia
(you can also addHTML
andJS
in case some sites are tricky).Find the Request URLs Look for something like:
https://play.gotomymv.life/ts/745bc3b1.../21633/000.png.ts
or
https://examplevideo.com/hls/videos/.../seg-1-v1-a1.ts?validfrom=...
Check if the number in the URL increases between requests (e.g.,
000.ts
β001.ts
,seg-1.ts
βseg-2.ts
)You donβt need to find the last segment - as long as thereβs continuity, the script will handle the rest.
π What link_dl.py
Does
link_dl.py
is perfect for URL-based segment crawling, not.m3u8
playlist-based downloading.
π¦ Step 1: Scanning
It auto-generates segment links by incrementing the number in the known request pattern.
π Step 2: Wrapping into Tasks
Each URL is wrapped into tasks
for async downloading.
ποΈ Step 3: Download & Merge
Downloaded segments are merged into a single final video.
π§ͺ Run link_dl.py
- Download from Sequential Segment Links
Now that youβve confirmed the request URL structure, itβs time to unleash link_dl.py
and start downloading full videos like a stealthy ninja with a curl addiction.
π How to Use link_dl.py
Edit the Config Below
if __name__ == "__main__":
Paste your target config like this:
1 2 3 4 5 6 7
base_url = "https://play.gotomymv.life/ts/40c26f81749767102/L3R5MzAvbWVpanUv5bCR5bm06LCi5bCU6aG_UzA1RTA4/20763/" output_file = "Sheldon_s05e08.ts" headers = { "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36", "Referer": "https://www.bttwo.me", "Origin": "https://www.bttwo.me" }
Check the Segment Filename Pattern
In the script (around line 65), youβll see this key part:
1
seg_name = f"{i:03d}.png.ts"
This tells the script to download segments like:
000.png.ts, 001.png.ts, 002.png.ts ...
Modify this line if your segment pattern is different, like:
f"seg-{i}-v1-a1.ts"
f"{i}.ts"
f"{i:03}.jpg"
- β¦
Match it exactly to what you saw in DevTools - See below.
π§· Parameter Breakdown for link_dl.py
seg_name
At line 65, youβll see:
1
seg_name = f"{i:03d}.png.ts"
This is where the automation happens. Replace it to match the increasing part of your segment URL.
Example:
Full segment URL:
https://examplevideo.com/hls/videos/202409/09/9999/720P_4000K_99.mp4/seg-1-v1-a1.ts?validfrom=1749397780&validto=1749404980&hash=qwerty
The changing bit is
seg-1
,seg-2
,seg-3
β¦So your line should be:
1
seg_name = f"{i}-v1-a1.ts?validfrom=1749397780&validto=1749404980&hash=qwerty"
Just keep the rest of the URL static, and replace the counter with
{i}
or{i:03d}
depending on the pattern (zero-padded or not).base_url
This is everything before the segment name starts changing.
Using the same example:
https://examplevideo.com/hls/videos/202409/09/9999/720P_4000K_99.mp4/seg-1-v1-a1.ts?validfrom=1749397780&validto=1749404980&hash=qwerty
Here your
base_url
should be:1
base_url = "https://examplevideo.com/hls/videos/202409/09/9999/720P_4000K_99.mp4/seg-"
output_file
Examples:
1
output_file = "Sheldon_s05e08.ts"
headers
- Request Headers for DownloadingUsually, just tweak
Referer
andOrigin
based on what you saw in the browser.Examples:
1 2 3 4 5
headers = { "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36", "Referer": "https://example.com", "Origin": "https://example.com" }
Peek at the actual segment request in DevTool β
Headers
tab β scroll toRequest Headers
to copy & adapt.(Optional) Retry Times in
download_seg()
Sometimes the net gets cranky, or the server ghosts us mid-download.
So by default, each segment gets retried up to 3 times before giving up.Just tweak this line in the function if you wanna change:
1
async def download_seg(session, sem, seg_name, output_path, max_retries=3): # π Change this "3" to any number you want
Run link_dl.py
1
2
3
4
5
6
7
8
git clone https://github.com/kay-a11y/M3U8-Probe.git
cd M3U8-Probe
# π¦ Install dependencies
pip install -r requirements.txt
# π Run the script
python src/link_dl.py
π Sample Output
1
2
3
4
π Scanning segments: 148seg [01:39, 1.49seg/s]
π Downloading: 99%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 146/148 [00:14<00:00, 16.33seg/s]
2025-06-12 18:11:49,064 - WARNING - π₯ 020.png.ts error: (try 1/3)
π Downloading: 100%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 148/148 [00:17<00:00, 8.41seg/s]
1
2
3
4
5
6
7
8
βββ ...
βββ README.md
βββ Sheldon_s05e08.ts ππ» Your video is here!
βββ data
β βββ Sheldon_s05e08 ππ» Will be empty after merging, feel free to delete this
βββ src
βββ link_dl.py
βββ m3u8_dl.py
π¬ (Optional) Convert .ts
to .mp4
with ffmpeg
When youβre done collecting all the juicy .ts
segments into one big file, you might want to dress it up a bit - .mp4
style. Hereβs how to glam it up with zero re-encoding:
1
2
3
4
5
# π§° Install ffmpeg (if not already)
sudo apt install ffmpeg
# π Convert .ts to .mp4 without re-encoding
ffmpeg -i whole.ts -c copy final.mp4
This command simply re-wraps the
.ts
into.mp4
- no quality loss, no long wait. Just pure elegance.
π Troubleshooting
Output video looks complete in size, but actually only plays a few seconds
Normally, we might use if not line.strip().startswith("#")
to identify valid media segment lines in an .m3u8
file.
However, when ad chunks are present - typically following a #EXT-X-DISCONTINUITY
tag - this method becomes unreliable.
If we stick to the original approach when processing .m3u8
files that include ad or junk segments, the resulting video might behave strangely.
For instance, you could end up with a clip thatβs only a few seconds long but has a file size as large as a full video - like 10 seconds weighing in at 200 MB.
Hereβs another method to extract valid media segment lines from an .m3u8
file (get_segs()
in m3u8_dl.py
):
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
legit_prefix = "/" + "/".join(base_url.split("/")[3:]) # e.g. /20240701/xxx/yyy/hls
with open(m3u8_file, 'r', encoding="utf-8") as f:
for line in f:
line = line.strip()
if not line or line.startswith("#"):
continue # skip empty/comment lines
# case 1: looks like /...jpg (actual .ts style)
if line.startswith(legit_prefix):
segs.append(line)
# case 2: plain '000.ts'-style segment
elif "/" not in line and line.endswith(".ts"):
segs.append(line)
# case 3: absolute URL with .ts extension
elif line.startswith("http") and line.endswith(".ts"):
segs.append(line)
This part first filters out empty lines and comments (those starting with #
). Then, it categorizes segment lines into three types:
Path-style segments that look like
/...jpg
but are actually.ts
: These are validated using alegit_prefix
, which is parsed from the remote.m3u8
URL (e.g./20240701/xxx/yyy/hls
). This helps exclude unrelated junk segments or ads.Plain numbered
.ts
segments: For example,000.ts
,002.ts
, etc. These are assumed to be valid by default.Absolute URLs ending in
.ts
: For example,https://astream.org/stream/613615ae/1080/index0.ts
. These are also accepted as legit segments.
You can tweak the logic to adapt to different
.m3u8
formats depending on the siteβs layout or how the playlist is structured.
[Errno 111] Connection refused
2025-06-17 17:27:32,018 - ERROR - π₯ Failed to download M3U8: SOCKSHTTPSConnectionPool(host='play.modujx12.com', port=443): Max retries exceeded with url: /20240701/1rIWCjMw/2000kb/hls/index.m3u8 (Caused by NewConnectionError('<urllib3.contrib.socks.SOCKSHTTPSConnection object at 0x7172bf6fdcf0>: Failed to establish a new connection: [Errno 111] Connection refused'))
This error usually means your connection attempt was blocked or your proxy isnβt configured correctly.
If youβre using a proxy, double check your port settings.
Also run:
1
git config --global --list
to ensure your Git proxy settings arenβt interfering with Python requests.
See this section for troubleshooting tips: Solve Port Issues
Debug with logging into a file
Try to write logs right into a file, so you can keep receipts on every .ts
, every 404, every little juicy fail or win.
To log into a file, just add the filename
and filemode
:
1
2
3
4
5
6
7
8
9
import logging
from datetime import date
logging.basicConfig(
level=logging.DEBUG,
format='%(asctime)s - %(levelname)s - %(message)s',
filename=f"logs/{date.today()}.log",
filemode='w'
)