# HG changeset patch
# User David Barts
# Date 1577422598 28800
# Node ID 9df9ff8cecde8bbf15f835decf93b4047ac614a1
# Parent da3fb2312c88632d1b498a8c53f5e3bfef764d65
Undo that; ignoring <pre> is a sticky wicket.

diff -r da3fb2312c88 -r 9df9ff8cecde curlers.py
--- a/curlers.py	Thu Dec 26 20:38:37 2019 -0800
+++ b/curlers.py	Thu Dec 26 20:56:38 2019 -0800
@@ -21,12 +21,13 @@
     "'bout", "'nuff", "'round", "'cause" , "'em" ]
 
 # HTML tags that enclose raw data
-_RAW = set(["script", "style", "pre"])
+_RAW = set(["script", "style"])
 
 # HTML block elements
 _BLOCK = set([
-    "address", "blockquote", "div", "dl", "fieldset", "form", "h1", "h2",
-    "h3", "h4", "h5", "h6", "hr", "noscript", "ol", "p", "table", "ul"
+    "address", "blockquote", "div", "dl", "fieldset", "form", "h1",
+    "h2", "h3", "h4", "h5", "h6", "hr", "noscript", "ol", "p", "pre",
+    "table", "ul"
 ])
 
 # F u n c t i o n s
@@ -211,7 +212,7 @@
         # only a matching end tag gets us out of the raw state
         if ws[pos] == '<' and ws[pos:end].lower() == self._endtag and (not ws[end].isalnum()):
             self._ltpos = pos
-            self._state = self._norm if self._endtag == "