# HG changeset patch
# User David Barts
# Date 1577421517 28800
# Node ID da3fb2312c88632d1b498a8c53f5e3bfef764d65
# Parent  d5198c7ec54dddeb19c1411f6a9d997d5b773fdc
Leave bodies of <pre> tags alone.

diff -r d5198c7ec54d -r da3fb2312c88 curlers.py
--- a/curlers.py	Thu Dec 26 20:24:32 2019 -0800
+++ b/curlers.py	Thu Dec 26 20:38:37 2019 -0800
@@ -21,13 +21,12 @@
     "'bout", "'nuff", "'round", "'cause" , "'em" ]
 
 # HTML tags that enclose raw data
-_RAW = set(["script", "style"])
+_RAW = set(["script", "style", "pre"])
 
 # HTML block elements
 _BLOCK = set([
-    "address", "blockquote", "div", "dl", "fieldset", "form", "h1",
-    "h2", "h3", "h4", "h5", "h6", "hr", "noscript", "ol", "p", "pre",
-    "table", "ul"
+    "address", "blockquote", "div", "dl", "fieldset", "form", "h1", "h2",
+    "h3", "h4", "h5", "h6", "hr", "noscript", "ol", "p", "table", "ul"
 ])
 
 # F u n c t i o n s
@@ -212,7 +211,7 @@
         # only a matching end tag gets us out of the raw state
         if ws[pos] == '<' and ws[pos:end].lower() == self._endtag and (not ws[end].isalnum()):
             self._ltpos = pos
-            self._state = self._seen_lt
+            self._state = self._norm if self._endtag == "