From: Paolo Bonzini
To: patchew-devel@redhat.com
Date: Fri, 28 Sep 2018 16:58:44 +0200
Message-Id: <20180928145845.20473-4-pbonzini@redhat.com>
In-Reply-To: <20180928145845.20473-1-pbonzini@redhat.com>
References: <20180928145845.20473-1-pbonzini@redhat.com>
Subject: [Patchew-devel] [PATCH v2 3/4] mbox: extract decode_payload and simplify get_body

We will have to extract the body of the message and process it, so we need
to do the same processing as get_body(). Extract it into a separate function.

Signed-off-by: Paolo Bonzini
---
 mbox.py | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/mbox.py b/mbox.py
index 35b2270..47a770f 100644
--- a/mbox.py
+++ b/mbox.py
@@ -41,6 +41,18 @@ def addr_db_to_rest(obj):
     else:
         return {"address": obj[1]}
 
+def decode_payload(m):
+    payload = m.get_payload(decode=True)
+    charset = m.get_content_charset()
+    try:
+        return payload.decode(charset or 'utf-8', errors='replace')
+    except:
+        if charset != 'utf-8':
+            # Still fall back from non-utf-8 to utf-8
+            return payload.decode('utf-8')
+        else:
+            raise
+
 class MboxMessage(object):
     """ Helper class to process mbox """
     def __init__(self, m):
@@ -161,21 +173,11 @@ class MboxMessage(object):
         return s.intersection(self.get_prefixes(upper=True))
 
     def get_body(self):
-        def decode_payload(payload, charset):
-            try:
-                return payload.decode(charset or 'utf-8', errors='replace')
-            except:
-                if charset != 'utf-8':
-                    # Still fall back from non-utf-8 to utf-8
-                    return payload.decode('utf-8')
-                else:
-                    raise
         def _get_message_text(m):
             payload = m.get_payload(decode=not self._m.is_multipart())
             body = ''
             if m.get_content_type() == "text/plain":
-                body = decode_payload(m.get_payload(decode=True),
-                                      self._m.get_content_charset())
+                body = decode_payload(m)
             elif isinstance(payload, list):
                 for p in payload:
                     body += _get_message_text(p)
-- 
2.17.1

_______________________________________________
Patchew-devel mailing list
Patchew-devel@redhat.com
https://www.redhat.com/mailman/listinfo/patchew-devel
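
To try the extracted helper outside the patch, the following is a minimal standalone sketch that uses only the standard library email package. The body of decode_payload follows the patch, except that it catches LookupError (an unknown charset name) instead of using a bare except; that narrowing is an assumption of this sketch, not something the patch does. get_text_body is a hypothetical stand-in for MboxMessage.get_body(), and the sample messages are invented for illustration.

```python
import email
from email.message import EmailMessage


def decode_payload(m):
    """Decode a single (non-multipart) message part to str."""
    payload = m.get_payload(decode=True)  # bytes, with the transfer encoding undone
    charset = m.get_content_charset()     # None if the part declares no charset
    try:
        return payload.decode(charset or 'utf-8', errors='replace')
    except LookupError:
        # Assumption of this sketch: with errors='replace', an unknown
        # charset name is the failure worth handling here.
        if charset != 'utf-8':
            # Still fall back from non-utf-8 to utf-8
            return payload.decode('utf-8')
        raise


def get_text_body(msg):
    """Hypothetical helper: concatenate every text/plain part of msg."""
    return ''.join(decode_payload(part)
                   for part in msg.walk()
                   if part.get_content_type() == 'text/plain')


# A quoted-printable, ISO-8859-1 message (invented sample data).
raw = (b"Content-Type: text/plain; charset=iso-8859-1\n"
       b"Content-Transfer-Encoding: quoted-printable\n"
       b"\n"
       b"caf=E9 =3D ok\n")
print(decode_payload(email.message_from_bytes(raw)))

# A multipart message: only the text/plain alternative is returned.
multi = EmailMessage()
multi.set_content("plain text part")
multi.add_alternative("<p>html part</p>", subtype="html")
print(get_text_body(multi))
```

Because decode_payload reads the charset from the part it is given, each text/plain part is decoded with its own declared charset. The helper should only be called on leaf parts, since get_payload(decode=True) returns None for a multipart container.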