From nobody Mon Feb 9 13:01:49 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EB098239E75 for ; Mon, 19 Jan 2026 21:06:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1768856812; cv=none; b=ntt4EdSImxmaxmZBCQr3v8qGKLdfa5FtndOQ/5E84kKsHjYTYPMRtT7zqbOqZN49b9pgGPSv2Mbphl2TvLRHf/sGmHP1nJAHZfBsnbcqnwZSvu3gncZ1EPSZZek92PqSoiFZncy5mEqfh46S1l9NcEIkfqYuP7SrlZfU3MbgyGs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1768856812; c=relaxed/simple; bh=Cv7xdVb8TPBh5SD/przOy40ZShHXp7REu15vrs83Fpk=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=K0SCdE6HN8pawQ4pZJr7NS+iMl8uCerTJvKvJ++2BbE4kOYAPvHMLM5OcwQ6o5B7LDXZ8IC41Yuk3ZVAa3nBKyuFuiTeNlw7Xl+9dpyYdxNiz8+MJdw2Go5mBsPBVnnizNoZGfy0CA1g9Jk9JMAxC+e9uJi04ezHoLUZ9H8PbMU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=gEn+kGp3; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="gEn+kGp3" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1768856810; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=NAhJpBpuwY85AzW6pQHTpyADsq9b83OtuK/qOzl5+7M=; b=gEn+kGp3ZY0QEE4zi/hbWVNnaILHQayS8sEeaCdWFaMlBEumzINY5z6TvLuajP+CMi1AIh gedyuI7XBKmscldswr35EQcL3HNXkTkEIvRO/iwKX1Gb9a2RKEHyi79eE60gFnbSA26jdq bxQjOq0YrMSSc/y1Vp4zAASZy1sHsyI= Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-515-BrduNxfKPBGLa-p8j_kXqA-1; Mon, 19 Jan 2026 16:06:46 -0500 X-MC-Unique: BrduNxfKPBGLa-p8j_kXqA-1 X-Mimecast-MFC-AGG-ID: BrduNxfKPBGLa-p8j_kXqA_1768856805 Received: from mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.17]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id CC8B01956048; Mon, 19 Jan 2026 21:06:45 +0000 (UTC) Received: from fedora.redhat.com (unknown [10.22.80.137]) by mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 4DC6319560AB; Mon, 19 Jan 2026 21:06:43 +0000 (UTC) From: Wander Lairson Costa To: Steven Rostedt , Gabriele Monaco , Nam Cao , Wander Lairson Costa , linux-kernel@vger.kernel.org (open list), linux-trace-kernel@vger.kernel.org (open list:RUNTIME VERIFICATION (RV)) Subject: [PATCH 20/26] rv/rvgen: refactor automata.py to use iterator-based parsing Date: Mon, 19 Jan 2026 17:45:56 -0300 Message-ID: <20260119205601.105821-21-wander@redhat.com> In-Reply-To: <20260119205601.105821-1-wander@redhat.com> References: <20260119205601.105821-1-wander@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.0 on 10.30.177.17 Content-Type: text/plain; charset="utf-8" Refactor the DOT file parsing logic in automata.py to use Python's iterator-based patterns instead of manual cursor indexing. The previous implementation relied on while loops with explicit cursor management, which made the code prone to off-by-one errors and would crash on malformed input files containing empty lines. The new implementation uses enumerate and itertools.islice to iterate over lines, eliminating manual cursor tracking. Functions that search for specific markers now use for loops with early returns and explicit AutomataError exceptions for missing markers, rather than assuming the markers exist. Additional bounds checking ensures that split line arrays have sufficient elements before accessing specific indices, preventing IndexError exceptions on malformed DOT files. The matrix creation and event variable extraction methods now use functional patterns with map combined with itertools.islice, making the intent clearer while maintaining the same behavior. Minor improvements include using extend instead of append in a loop, adding empty file validation, and replacing enumerate with range where the enumerated value was unused. Signed-off-by: Wander Lairson Costa Reviewed-by: Gabriele Monaco --- tools/verification/rvgen/rvgen/automata.py | 109 +++++++++++++-------- 1 file changed, 67 insertions(+), 42 deletions(-) diff --git a/tools/verification/rvgen/rvgen/automata.py b/tools/verificatio= n/rvgen/rvgen/automata.py index 083d0f5cfb773..a6889d0c26c3f 100644 --- a/tools/verification/rvgen/rvgen/automata.py +++ b/tools/verification/rvgen/rvgen/automata.py @@ -9,6 +9,7 @@ # Documentation/trace/rv/deterministic_automata.rst =20 import ntpath +from itertools import islice =20 =20 class AutomataError(OSError): @@ -53,37 +54,54 @@ class Automata: return model_name =20 def __open_dot(self): - cursor =3D 0 dot_lines =3D [] try: with open(self.__dot_path) as dot_file: - dot_lines =3D dot_file.read().splitlines() + dot_lines =3D dot_file.readlines() except OSError as exc: raise AutomataError(f"Cannot open the file: {self.__dot_path}"= ) from exc =20 + if not dot_lines: + raise AutomataError(f"{self.__dot_path} is empty") + # checking the first line: - line =3D dot_lines[cursor].split() + line =3D dot_lines[0].split() =20 - if (line[0] !=3D "digraph") or (line[1] !=3D "state_automaton"): + if len(line) < 2 or line[0] !=3D "digraph" or line[1] !=3D "state_= automaton": raise AutomataError(f"Not a valid .dot format: {self.__dot_pat= h}") - else: - cursor +=3D 1 + return dot_lines =20 def __get_cursor_begin_states(self): - cursor =3D 0 - while self.__dot_lines[cursor].split()[0] !=3D "{node": - cursor +=3D 1 - return cursor + for cursor, line in enumerate(self.__dot_lines): + split_line =3D line.split() + + if len(split_line) and split_line[0] =3D=3D "{node": + return cursor + + raise AutomataError("Could not find a beginning state") =20 def __get_cursor_begin_events(self): - cursor =3D 0 - while self.__dot_lines[cursor].split()[0] !=3D "{node": - cursor +=3D 1 - while self.__dot_lines[cursor].split()[0] =3D=3D "{node": - cursor +=3D 1 - # skip initial state transition - cursor +=3D 1 + state =3D 0 + cursor =3D 0 # make pyright happy + + for cursor, line in enumerate(self.__dot_lines): + line =3D line.split() + if not line: + continue + + if state =3D=3D 0: + if line[0] =3D=3D "{node": + state =3D 1 + elif line[0] !=3D "{node": + break + else: + raise AutomataError("Could not find beginning event") + + cursor +=3D 1 # skip initial state transition + if cursor =3D=3D len(self.__dot_lines): + raise AutomataError("Dot file ended after event beginning") + return cursor =20 def __get_state_variables(self): @@ -96,9 +114,12 @@ class Automata: cursor =3D self.__get_cursor_begin_states() =20 # process nodes - while self.__dot_lines[cursor].split()[0] =3D=3D "{node": - line =3D self.__dot_lines[cursor].split() - raw_state =3D line[-1] + for line in islice(self.__dot_lines, cursor, None): + split_line =3D line.split() + if not split_line or split_line[0] !=3D "{node": + break + + raw_state =3D split_line[-1] =20 # "enabled_fired"}; -> enabled_fired state =3D raw_state.replace('"', "").replace("};", "").replace= (",", "_") @@ -106,16 +127,14 @@ class Automata: initial_state =3D state[len(self.init_marker) :] else: states.append(state) - if "doublecircle" in self.__dot_lines[cursor]: + if "doublecircle" in line: final_states.append(state) has_final_states =3D True =20 - if "ellipse" in self.__dot_lines[cursor]: + if "ellipse" in line: final_states.append(state) has_final_states =3D True =20 - cursor +=3D 1 - if initial_state is None: raise AutomataError("The automaton doesn't have a initial stat= e") =20 @@ -130,26 +149,27 @@ class Automata: return states, initial_state, final_states =20 def __get_event_variables(self): + events: list[str] =3D [] # here we are at the begin of transitions, take a note, we will re= turn later. cursor =3D self.__get_cursor_begin_events() =20 - events =3D [] - while self.__dot_lines[cursor].lstrip()[0] =3D=3D '"': + for line in map(str.lstrip, islice(self.__dot_lines, cursor, None)= ): + if not line.startswith('"'): + break + # transitions have the format: # "all_fired" -> "both_fired" [ label =3D "disable_irq" ]; # ------------ event is here ------------^^^^^ - if self.__dot_lines[cursor].split()[1] =3D=3D "->": - line =3D self.__dot_lines[cursor].split() - event =3D line[-2].replace('"', "") + split_line =3D line.split() + if len(split_line) > 1 and split_line[1] =3D=3D "->": + event =3D split_line[-2].replace('"', "") =20 # when a transition has more than one labels, they are lik= e this # "local_irq_enable\nhw_local_irq_enable_n" # so split them. =20 event =3D event.replace("\\n", " ") - for i in event.split(): - events.append(i) - cursor +=3D 1 + events.extend(event.split()) =20 return sorted(set(events)) =20 @@ -171,32 +191,37 @@ class Automata: =20 # declare the matrix.... matrix =3D [ - [self.invalid_state_str for x in range(nr_event)] for y in ran= ge(nr_state) + [self.invalid_state_str for _ in range(nr_event)] for _ in ran= ge(nr_state) ] =20 # and we are back! Let's fill the matrix cursor =3D self.__get_cursor_begin_events() =20 - while self.__dot_lines[cursor].lstrip()[0] =3D=3D '"': - if self.__dot_lines[cursor].split()[1] =3D=3D "->": - line =3D self.__dot_lines[cursor].split() - origin_state =3D line[0].replace('"', "").replace(",", "_") - dest_state =3D line[2].replace('"', "").replace(",", "_") - possible_events =3D line[-2].replace('"', "").replace("\\n= ", " ") + for line in map(str.lstrip, + islice(self.__dot_lines, cursor, None)): + + if not line or line[0] !=3D '"': + break + + split_line =3D line.split() + + if len(split_line) > 2 and split_line[1] =3D=3D "->": + origin_state =3D split_line[0].replace('"', "").replace(",= ", "_") + dest_state =3D split_line[2].replace('"', "").replace(",",= "_") + possible_events =3D split_line[-2].replace('"', "").replac= e("\\n", " ") for event in possible_events.split(): matrix[states_dict[origin_state]][events_dict[event]] = =3D dest_state - cursor +=3D 1 =20 return matrix =20 def __store_init_events(self): events_start =3D [False] * len(self.events) events_start_run =3D [False] * len(self.events) - for i, _ in enumerate(self.events): + for i in range(len(self.events)): curr_event_will_init =3D 0 curr_event_from_init =3D False curr_event_used =3D 0 - for j, _ in enumerate(self.states): + for j in range(len(self.states)): if self.function[j][i] !=3D self.invalid_state_str: curr_event_used +=3D 1 if self.function[j][i] =3D=3D self.initial_state: --=20 2.52.0