moved URL shortening code to https://github.com/pythonfluente/pythonfluente2e
parent 8c8c08170a
commit cf3161ca00
@ -1,52 +1,6 @@
# Short links for URLs in the book

This file is deployed as `.htaccess` to the FPY.LI domain
to map short URLs in Fluent Python to the original URLs.

To update it, I use tools in this other repo:

https://github.com/pythonfluente/pythonfluente2e

## Problem: link rot

_Fluent Python, Second Edition_ has more than 1000 links to external resources.
Inevitably, some of those links will rot as time passes.
But I can't change the URLs in the print book...

## Solution: indirection

I replaced almost all URLs in the book with shortened versions that go through the `fpy.li` site, which I control.
The site has an `.htaccess` file with *temporary* redirects.

When I find out a link is stale, I can change the redirect in `.htaccess` to a new target,
which may be a link to a copy in the Internet Archive's
[Wayback Machine](https://archive.org/web/),
so the link in the book is back in service through the updated redirect.
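
To make the indirection concrete, here is a minimal Python sketch of the idea. It is not how the site works (Apache serves the real redirects); it only models the redirect table as a dictionary and repoints one short path. The helper names `parse_redirects` and `retarget` and the `example.com` URLs are hypothetical.

```python
def parse_redirects(text: str) -> dict[str, str]:
    """Map short paths (without the leading slash) to target URLs."""
    redirects = {}
    for line in text.splitlines():
        if line.startswith('RedirectTemp'):
            _, short, target = line.split()
            redirects[short.removeprefix('/')] = target
    return redirects


def retarget(redirects: dict[str, str], short: str, new_target: str) -> None:
    """Point an existing short path at a new target, e.g. an archived copy."""
    if short not in redirects:
        raise KeyError(f'unknown short path: {short}')
    redirects[short] = new_target


# Hypothetical directives, for illustration only.
SAMPLE = """\
RedirectTemp /1-3 https://example.com/original-page
RedirectTemp /code https://example.com/repo
"""

table = parse_redirects(SAMPLE)
retarget(table, '1-3', 'https://example.com/archived-copy')
print(table['1-3'])
```

The short URL printed in the book never changes; only the target on the right-hand side of the directive does.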

## Help wanted

Please report broken links as bugs in the [`FPY.LI.htaccess`](FPY.LI.htaccess) file.
Also, feel free to send pull requests with fixes to that file.
When I accept a PR, I will redeploy it to `fpy.li/.htaccess`.

## Details

Almost all URLs in the book are replaced with shortened versions like
[`http://fpy.li/1-3`](http://fpy.li/1-3), for chapter 1, link #3.

There are also custom short URLs like
[`https://fpy.li/code`](https://fpy.li/code), which redirects to the example code repository.
I used custom short URLs for URLs with 3 or more mentions, or links to PEPs.

Exceptions:

- URLs with `oreilly` in them are unchanged;
- the `fluentpython.com` URL (with no path) is unchanged.
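
The two exceptions can be written as a small predicate. This is a sketch under my reading of the rules above, not code from the repository: the function name `keeps_original_url` is hypothetical, it assumes URLs include a scheme, and treating a lone `/` as "no path" is an assumption.

```python
from urllib.parse import urlsplit


def keeps_original_url(url: str) -> bool:
    """Return True if the URL falls under one of the two exceptions."""
    if 'oreilly' in url:
        return True
    parts = urlsplit(url)
    # Assumption: "no path" also covers a lone trailing slash.
    return parts.hostname == 'fluentpython.com' and parts.path in ('', '/')


for url in (
    'https://www.oreilly.com/library/view/fluent-python-2nd/9781492056348/',
    'https://fluentpython.com',
    'https://www.python.org/dev/peps/pep-0484/',
):
    print(keeps_original_url(url), url)
```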

The `custom.htaccess` file contains redirects with custom names
plus numbered URLs generated from the links in each chapter in
the Second Edition in English.

`short.htaccess` has redirects made by `short.py`, starting
with the Second Edition in Brazilian Portuguese.
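
The names in `short.htaccess` follow a fixed order. As a sketch modeled on the `gen_short` generator in `short.py`, the snippet below enumerates the first few two-character names over the same 30-character alphabet, which omits `0`, `1`, `i`, `l`, `o`, and `u`.

```python
import itertools
from collections.abc import Iterator

SDIGITS = '23456789abcdefghjkmnpqrstvwxyz'  # same alphabet as short.py


def gen_short(start_len: int = 1) -> Iterator[str]:
    """Yield every sequence of SDIGITS, shortest first, from start_len up."""
    length = start_len
    while True:
        for chars in itertools.product(SDIGITS, repeat=length):
            yield ''.join(chars)
        length += 1


first = list(itertools.islice(gen_short(2), 11))
print(first)
# ['22', '23', '24', '25', '26', '27', '28', '29', '2a', '2b', '2c']
```

These are the same eleven names, `/22` through `/2c`, that appear in the `short.htaccess` file in this commit. In `short.py`, `gen_unused_short` additionally skips any name already present in either `.htaccess` file.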

The deployed file is the concatenation of the two:

```shell
cat custom.htaccess short.htaccess > FPY.LI.htaccess
```

`FPY.LI.htaccess` is deployed at the root folder in `http://fpy.li`.
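
Because `cat` concatenates blindly, a quick check of the combined file before deployment can catch a short path that is defined twice. `short.py` makes a similar check when it loads the two source files; the standalone sketch below is an assumption about a useful safeguard, `find_duplicate_shorts` is a hypothetical name, and it assumes every `RedirectTemp` line is well formed.

```python
from collections import Counter


def find_duplicate_shorts(htaccess_text: str) -> list[str]:
    """Return short paths that appear in more than one RedirectTemp line."""
    counts = Counter(
        line.split()[1]
        for line in htaccess_text.splitlines()
        if line.startswith('RedirectTemp')
    )
    return sorted(short for short, n in counts.items() if n > 1)


# Hypothetical combined content, for illustration only.
combined = """\
RedirectTemp /code https://example.com/repo
RedirectTemp /22 https://example.com/a
RedirectTemp /22 https://example.com/b
"""
print(find_duplicate_shorts(combined))
```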
File diff suppressed because it is too large
@ -1,2 +0,0 @@
#!/bin/bash
scp FPY.LI.htaccess dh_i4p2ka@fpy.li:~/fpy.li/.htaccess
@ -1,47 +0,0 @@
https://www.oreilly.com/library/view/fluent-python-2nd/9781492056348/
https://dask.org/
http://example.com/1572039572038573208
http://www.unicode.org/
https://www.techcrunch.com/2024/startup-funding-trends
https://blog.medium.com/writing-tips-for-beginners
https://github.com/microsoft/typescript
https://stackoverflow.com/questions/javascript-async-await
https://www.reddit.com/r/programming/hot
https://docs.google.com/spreadsheets/create
https://www.youtube.com/watch?v=dQw4w9WgXcQ
https://www.amazon.com/dp/B08N5WRWNW
https://support.apple.com/iphone-setup-guide
https://www.wikipedia.org/wiki/Machine_Learning
https://www.linkedin.com/in/johndoe123
https://www.instagram.com/p/CxYz123AbC/
https://twitter.com/elonmusk/status/1234567890
https://www.facebook.com/events/987654321
https://drive.google.com/file/d/1AbCdEfGhIjKlMnOp/view
https://www.dropbox.com/s/qwerty123/document.pdf
https://zoom.us/j/1234567890?pwd=abcdef
https://calendly.com/janedoe/30min-meeting
https://www.shopify.com/admin/products/new
https://stripe.com/docs/api/charges/create
https://www.paypal.com/invoice/create
https://mailchimp.com/campaigns/dashboard
https://analytics.google.com/analytics/web/
https://console.aws.amazon.com/s3/buckets
https://portal.azure.com/dashboard
https://www.figma.com/file/AbCdEf123456/design-system
https://www.notion.so/workspace/project-notes
https://trello.com/b/AbCdEfGh/marketing-board
https://slack.com/app_redirect?channel=general
https://discord.gg/AbCdEfGh123
https://www.twitch.tv/streamername/videos
https://www.spotify.com/playlist/37i9dQZF1DXcBWIGoYBM5M
https://www.netflix.com/browse/genre/83
https://www.hulu.com/series/breaking-bad-2008
https://www.airbnb.com/rooms/12345678
https://www.booking.com/hotel/us/grand-plaza.html
https://www.expedia.com/flights/search?trip=roundtrip
https://www.uber.com/ride/request
https://www.doordash.com/store/pizza-palace-123
https://www.grubhub.com/restaurant/tacos-el-rey-456
https://www.zillow.com/homes/for_sale/San-Francisco-CA
https://www.craigslist.org/about/sites
https://www.python.org/dev/peps/pep-0484/
@ -1,14 +0,0 @@
# content of short.htaccess file created and managed by short.py

# appended: 2025-05-23 15:12:13
RedirectTemp /22 https://pythonfluente.com/2/#pattern_matching_case_study_sec
RedirectTemp /23 https://pythonfluente.com/2/#how_slicing_works
RedirectTemp /24 https://pythonfluente.com/2/#sliceable_sequence
RedirectTemp /25 https://pythonfluente.com/2/#virtual_subclass_sec
RedirectTemp /26 https://pythonfluente.com/2/#environment_class_ex
RedirectTemp /27 https://pythonfluente.com/2/#subclass_builtin_woes
RedirectTemp /28 https://pythonfluente.com/2/#slots_section
RedirectTemp /29 https://pythonfluente.com/2/#typeddict_sec
RedirectTemp /2a https://pythonfluente.com/2/#problems_annot_runtime_sec
RedirectTemp /2b https://pythonfluente.com/2/#legacy_deprecated_typing_box
RedirectTemp /2c https://pythonfluente.com/2/#positional_pattern_implement_sec
@ -1,95 +0,0 @@
#!/usr/bin/env python3

"""
short.py generates unique short URLs.

This script reads lines from stdin or files named as arguments, then:

1. retrieves or creates new short URLs, taking into account existing RedirectTemp
   directives in custom.htaccess or short.htaccess;
2. appends RedirectTemp directives for newly created short URLs to short.htaccess;
3. outputs the list of (short, long) URLs retrieved or created.

"""

import fileinput
import itertools
from collections.abc import Iterator
from time import strftime

HTACCESS_CUSTOM = 'custom.htaccess'
HTACCESS_SHORT = 'short.htaccess'
HTACCESS_FILES = (HTACCESS_CUSTOM, HTACCESS_SHORT)
BASE_DOMAIN = 'fpy.li'


def load_redirects() -> tuple[dict, dict]:
    """Return (short -> long, long -> short) mappings read from HTACCESS_FILES."""
    redirects = {}
    targets = {}
    for filename in HTACCESS_FILES:
        with open(filename) as fp:
            for line in fp:
                if line.startswith('RedirectTemp'):
                    _, short, long = line.split()
                    short = short[1:]  # Remove leading slash
                    assert short not in redirects, f'{filename}: duplicate redirect from {short}'
                    # custom.htaccess has been live since 2022, so we can't
                    # change it to remove duplicate targets
                    if filename != HTACCESS_CUSTOM:
                        assert long not in targets, f'{filename}: duplicate redirect to {long}'
                    redirects[short] = long
                    targets[long] = short
    return redirects, targets


SDIGITS = '23456789abcdefghjkmnpqrstvwxyz'


def gen_short(start_len=1) -> Iterator[str]:
    """Generate every possible sequence of SDIGITS, starting with length start_len."""
    length = start_len
    while True:
        for short in itertools.product(SDIGITS, repeat=length):
            yield ''.join(short)
        length += 1


def gen_unused_short(redirects: dict) -> Iterator[str]:
    """Generate next available short URL of len >= 2."""
    for short in gen_short(2):
        if short not in redirects:
            yield short


def shorten(urls: list[str]) -> list[tuple[str, str]]:
    """Return (short, long) pairs, appending directives to HTACCESS_SHORT as needed."""
    redirects, targets = load_redirects()
    iter_short = gen_unused_short(redirects)
    pairs = []
    timestamp = strftime('%Y-%m-%d %H:%M:%S')
    with open(HTACCESS_SHORT, 'a') as fp:
        for long in urls:
            assert BASE_DOMAIN not in long, f'{long} is a {BASE_DOMAIN} URL'
            if long in targets:
                short = targets[long]
            else:
                short = next(iter_short)
                redirects[short] = long
                targets[long] = short
                # Write the timestamp header once, before the first new directive
                if timestamp:
                    fp.write(f'\n# appended: {timestamp}\n')
                    timestamp = None
                fp.write(f'RedirectTemp /{short} {long}\n')
            pairs.append((short, long))

    return pairs


def main() -> None:
    """Read URLs from filename arguments or stdin."""
    urls = [line.strip() for line in fileinput.input(encoding='utf-8')]
    for short, long in shorten(urls):
        print(f'{BASE_DOMAIN}/{short}\t{long}')


if __name__ == '__main__':
    main()