From nobody Sat Nov 30 05:43:21 2024
Date: Wed, 11 Sep 2024 10:30:17 +0100
In-Reply-To: <20240911093029.3279154-1-vdonnefort@google.com>
References: <20240911093029.3279154-1-vdonnefort@google.com>
Message-ID: <20240911093029.3279154-2-vdonnefort@google.com>
Subject: [PATCH 01/13] ring-buffer: Check for empty ring-buffer with rb_num_of_entries()
From: Vincent Donnefort
To: rostedt@goodmis.org, mhiramat@kernel.org, linux-trace-kernel@vger.kernel.org,
    maz@kernel.org, oliver.upton@linux.dev
Cc: kvmarm@lists.linux.dev, will@kernel.org, qperret@google.com,
    kernel-team@android.com, linux-kernel@vger.kernel.org, Vincent Donnefort

Currently there are two ways of identifying an empty ring-buffer. One
relies on the current status of the commit / reader page
(rb_per_cpu_empty()) and the other on the write and read counters
(rb_num_of_entries(), used in rb_get_reader_page()). Unify the two by
checking for an empty ring-buffer with rb_num_of_entries().

This intends to ease the later introduction of ring-buffer writers which
are out of the kernel's control and for whom the only information
available is the meta-page counters.

Signed-off-by: Vincent Donnefort

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index cebd879a30cb..7abe671effbf 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -3941,40 +3941,22 @@ int ring_buffer_write(struct trace_buffer *buffer,
 }
 EXPORT_SYMBOL_GPL(ring_buffer_write);
 
-static bool rb_per_cpu_empty(struct ring_buffer_per_cpu *cpu_buffer)
+/*
+ * The total entries in the ring buffer is the running counter
+ * of entries entered into the ring buffer, minus the sum of
+ * the entries read from the ring buffer and the number of
+ * entries that were overwritten.
+ */
+static inline unsigned long
+rb_num_of_entries(struct ring_buffer_per_cpu *cpu_buffer)
 {
-	struct buffer_page *reader = cpu_buffer->reader_page;
-	struct buffer_page *head = rb_set_head_page(cpu_buffer);
-	struct buffer_page *commit = cpu_buffer->commit_page;
-
-	/* In case of error, head will be NULL */
-	if (unlikely(!head))
-		return true;
-
-	/* Reader should exhaust content in reader page */
-	if (reader->read != rb_page_size(reader))
-		return false;
-
-	/*
-	 * If writers are committing on the reader page, knowing all
-	 * committed content has been read, the ring buffer is empty.
-	 */
-	if (commit == reader)
-		return true;
-
-	/*
-	 * If writers are committing on a page other than reader page
-	 * and head page, there should always be content to read.
-	 */
-	if (commit != head)
-		return false;
+	return local_read(&cpu_buffer->entries) -
+	       (local_read(&cpu_buffer->overrun) + cpu_buffer->read);
+}
 
-	/*
-	 * Writers are committing on the head page, we just need
-	 * to care about there're committed data, and the reader will
-	 * swap reader page with head page when it is to read data.
-	 */
-	return rb_page_commit(commit) == 0;
+static bool rb_per_cpu_empty(struct ring_buffer_per_cpu *cpu_buffer)
+{
+	return !rb_num_of_entries(cpu_buffer);
 }
 
 /**
@@ -4120,19 +4102,6 @@ void ring_buffer_record_enable_cpu(struct trace_buffer *buffer, int cpu)
 }
 EXPORT_SYMBOL_GPL(ring_buffer_record_enable_cpu);
 
-/*
- * The total entries in the ring buffer is the running counter
- * of entries entered into the ring buffer, minus the sum of
- * the entries read from the ring buffer and the number of
- * entries that were overwritten.
- */
-static inline unsigned long
-rb_num_of_entries(struct ring_buffer_per_cpu *cpu_buffer)
-{
-	return local_read(&cpu_buffer->entries) -
-	       (local_read(&cpu_buffer->overrun) + cpu_buffer->read);
-}
-
 /**
  * ring_buffer_oldest_event_ts - get the oldest event timestamp from the buffer
  * @buffer: The ring buffer
-- 
2.46.0.598.g6f2099f65c-goog

From nobody Sat Nov 30 05:43:21 2024
Date: Wed, 11 Sep 2024 10:30:18 +0100
In-Reply-To: <20240911093029.3279154-1-vdonnefort@google.com>
References: <20240911093029.3279154-1-vdonnefort@google.com>
Message-ID: <20240911093029.3279154-3-vdonnefort@google.com>
Subject: [PATCH 02/13] ring-buffer: Introducing ring-buffer writer
From: Vincent Donnefort
To: rostedt@goodmis.org, mhiramat@kernel.org, linux-trace-kernel@vger.kernel.org,
    maz@kernel.org, oliver.upton@linux.dev
Cc: kvmarm@lists.linux.dev, will@kernel.org, qperret@google.com,
    kernel-team@android.com, linux-kernel@vger.kernel.org, Vincent Donnefort

A ring-buffer writer is an entity outside of the kernel (most likely a
firmware or a hypervisor) capable of writing events into a ring-buffer
following the same format as the tracefs ring-buffer.

To set up the ring-buffer on the kernel side, a description of the pages
(struct trace_page_desc) is necessary. A callback (get_reader_page) must
also be provided; it is called whenever the kernel is done reading the
previous reader page. The writer is expected to keep the meta-page
updated.
Signed-off-by: Vincent Donnefort

diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
index fd35d4ec12e1..d78a33b3c96e 100644
--- a/include/linux/ring_buffer.h
+++ b/include/linux/ring_buffer.h
@@ -83,21 +83,24 @@ u64 ring_buffer_event_time_stamp(struct trace_buffer *buffer,
 void ring_buffer_discard_commit(struct trace_buffer *buffer,
 				struct ring_buffer_event *event);
 
+struct ring_buffer_writer;
+
 /*
  * size is in bytes for each per CPU buffer.
  */
 struct trace_buffer *
-__ring_buffer_alloc(unsigned long size, unsigned flags, struct lock_class_key *key);
+__ring_buffer_alloc(unsigned long size, unsigned flags, struct lock_class_key *key,
+		    struct ring_buffer_writer *writer);
 
 /*
  * Because the ring buffer is generic, if other users of the ring buffer get
  * traced by ftrace, it can produce lockdep warnings. We need to keep each
  * ring buffer's lock class separate.
  */
-#define ring_buffer_alloc(size, flags)			\
-({							\
-	static struct lock_class_key __key;		\
-	__ring_buffer_alloc((size), (flags), &__key);	\
+#define ring_buffer_alloc(size, flags)				\
+({								\
+	static struct lock_class_key __key;			\
+	__ring_buffer_alloc((size), (flags), &__key, NULL);	\
 })
 
 typedef bool (*ring_buffer_cond_fn)(void *data);
@@ -228,4 +231,54 @@ int ring_buffer_map(struct trace_buffer *buffer, int cpu,
 		    struct vm_area_struct *vma);
 int ring_buffer_unmap(struct trace_buffer *buffer, int cpu);
 int ring_buffer_map_get_reader(struct trace_buffer *buffer, int cpu);
+
+#define meta_pages_lost(__meta) \
+	((__meta)->Reserved1)
+#define meta_pages_touched(__meta) \
+	((__meta)->Reserved2)
+
+struct rb_page_desc {
+	int		cpu;
+	int		nr_page_va;	/* exclude the meta page */
+	unsigned long	meta_va;
+	unsigned long	page_va[];
+};
+
+struct trace_page_desc {
+	int	nr_cpus;
+	char	__data[];	/* list of rb_page_desc */
+};
+
+static inline
+struct rb_page_desc *__next_rb_page_desc(struct rb_page_desc *pdesc)
+{
+	size_t len = struct_size(pdesc, page_va, pdesc->nr_page_va);
+
+	return (struct rb_page_desc *)((void *)pdesc + len);
+}
+
+static inline
+struct rb_page_desc *__first_rb_page_desc(struct trace_page_desc *trace_pdesc)
+{
+	return (struct rb_page_desc *)(&trace_pdesc->__data[0]);
+}
+
+#define for_each_rb_page_desc(__pdesc, __cpu, __trace_pdesc)		\
+	for (__pdesc = __first_rb_page_desc(__trace_pdesc), __cpu = 0;	\
+	     __cpu < (__trace_pdesc)->nr_cpus;				\
+	     __cpu++, __pdesc = __next_rb_page_desc(__pdesc))
+
+struct ring_buffer_writer {
+	struct trace_page_desc	*pdesc;
+	int (*get_reader_page)(int cpu);
+	int (*reset)(int cpu);
+};
+
+int ring_buffer_poll_writer(struct trace_buffer *buffer, int cpu);
+
+#define ring_buffer_reader(writer)				\
+({								\
+	static struct lock_class_key __key;			\
+	__ring_buffer_alloc(0, RB_FL_OVERWRITE, &__key, writer);\
+})
+
 #endif /* _LINUX_RING_BUFFER_H */
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 7abe671effbf..b05b7a95e3f1 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -495,6 +495,8 @@ struct ring_buffer_per_cpu {
 	unsigned long			*subbuf_ids;	/* ID to subbuf VA */
 	struct trace_buffer_meta	*meta_page;
 
+	struct ring_buffer_writer	*writer;
+
 	/* ring buffer pages to update, > 0 to add, < 0 to remove */
 	long				nr_pages_to_update;
 	struct list_head		new_pages; /* new pages to add */
@@ -517,6 +519,8 @@ struct trace_buffer {
 
 	struct ring_buffer_per_cpu	**buffers;
 
+	struct ring_buffer_writer	*writer;
+
 	struct hlist_node		node;
 	u64				(*clock)(void);
 
@@ -1555,6 +1559,42 @@ static int __rb_allocate_pages(struct ring_buffer_per_cpu *cpu_buffer,
 	return -ENOMEM;
 }
 
+static struct rb_page_desc *rb_page_desc(struct trace_page_desc *trace_pdesc,
+					 int cpu)
+{
+	struct rb_page_desc *pdesc;
+	size_t len;
+	int i;
+
+	if (!trace_pdesc)
+		return NULL;
+
+	if (cpu >= trace_pdesc->nr_cpus)
+		return NULL;
+
+	pdesc = __first_rb_page_desc(trace_pdesc);
+	len = struct_size(pdesc, page_va, pdesc->nr_page_va);
+	/* len is in bytes; don't let pointer arithmetic scale it again */
+	pdesc = (void *)pdesc + len * cpu;
+
+	if (pdesc->cpu == cpu)
+		return pdesc;
+
+	/* Missing CPUs, need to linear search */
+	for_each_rb_page_desc(pdesc, i, trace_pdesc) {
+		if (pdesc->cpu == cpu)
+			return pdesc;
+	}
+
+	return NULL;
+}
+
+static void *rb_page_desc_page(struct rb_page_desc *pdesc, int page_id)
+{
+	return page_id > pdesc->nr_page_va ? NULL : (void *)pdesc->page_va[page_id];
+}
+
 static int rb_allocate_pages(struct ring_buffer_per_cpu *cpu_buffer,
 			     unsigned long nr_pages)
 {
@@ -1614,6 +1654,31 @@ rb_allocate_cpu_buffer(struct trace_buffer *buffer, long nr_pages, int cpu)
 
 	cpu_buffer->reader_page = bpage;
 
+	if (buffer->writer) {
+		struct rb_page_desc *pdesc = rb_page_desc(buffer->writer->pdesc, cpu);
+
+		if (!pdesc)
+			goto fail_free_reader;
+
+		cpu_buffer->writer = buffer->writer;
+		cpu_buffer->meta_page = (struct trace_buffer_meta *)(void *)pdesc->meta_va;
+		cpu_buffer->subbuf_ids = pdesc->page_va;
+		cpu_buffer->nr_pages = pdesc->nr_page_va - 1;
+		atomic_inc(&cpu_buffer->record_disabled);
+		atomic_inc(&cpu_buffer->resize_disabled);
+
+		bpage->page = rb_page_desc_page(pdesc,
+						cpu_buffer->meta_page->reader.id);
+		if (!bpage->page)
+			goto fail_free_reader;
+		/*
+		 * The meta-page can only describe which of the ring-buffer page
+		 * is the reader. There is no need to init the rest of the
+		 * ring-buffer.
+		 */
+		return cpu_buffer;
+	}
+
 	page = alloc_pages_node(cpu_to_node(cpu), GFP_KERNEL | __GFP_COMP | __GFP_ZERO,
 				cpu_buffer->buffer->subbuf_order);
 	if (!page)
@@ -1651,6 +1716,10 @@ static void rb_free_cpu_buffer(struct ring_buffer_per_cpu *cpu_buffer)
 
 	irq_work_sync(&cpu_buffer->irq_work.work);
 
+	/* ring_buffers with writer set do not own the data pages */
+	if (cpu_buffer->writer)
+		cpu_buffer->reader_page->page = NULL;
+
 	free_buffer_page(cpu_buffer->reader_page);
 
 	if (head) {
@@ -1681,7 +1750,8 @@ static void rb_free_cpu_buffer(struct ring_buffer_per_cpu *cpu_buffer)
  * drop data when the tail hits the head.
  */
 struct trace_buffer *__ring_buffer_alloc(unsigned long size, unsigned flags,
-					 struct lock_class_key *key)
+					 struct lock_class_key *key,
+					 struct ring_buffer_writer *writer)
 {
 	struct trace_buffer *buffer;
 	long nr_pages;
@@ -1709,6 +1779,11 @@ struct trace_buffer *__ring_buffer_alloc(unsigned long size, unsigned flags,
 	buffer->flags = flags;
 	buffer->clock = trace_clock_local;
 	buffer->reader_lock_key = key;
+	if (writer) {
+		buffer->writer = writer;
+		/* The writing is done externally, never by the kernel */
+		atomic_inc(&buffer->record_disabled);
+	}
 
 	init_irq_work(&buffer->irq_work.work, rb_wake_up_waiters);
 	init_waitqueue_head(&buffer->irq_work.waiters);
@@ -4456,8 +4531,54 @@ rb_update_iter_read_stamp(struct ring_buffer_iter *iter,
 	}
 }
 
+static bool rb_read_writer_meta_page(struct ring_buffer_per_cpu *cpu_buffer)
+{
+	local_set(&cpu_buffer->entries, READ_ONCE(cpu_buffer->meta_page->entries));
+	local_set(&cpu_buffer->overrun, READ_ONCE(cpu_buffer->meta_page->overrun));
+	local_set(&cpu_buffer->pages_touched, READ_ONCE(meta_pages_touched(cpu_buffer->meta_page)));
+	local_set(&cpu_buffer->pages_lost, READ_ONCE(meta_pages_lost(cpu_buffer->meta_page)));
+	/*
+	 * No need to get the "read" field, it can be tracked here as any
+	 * reader will have to go through a ring_buffer_per_cpu.
+	 */
+
+	return rb_num_of_entries(cpu_buffer);
+}
+
 static struct buffer_page *
-rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
+__rb_get_reader_page_from_writer(struct ring_buffer_per_cpu *cpu_buffer)
+{
+	u32 prev_reader;
+
+	if (!rb_read_writer_meta_page(cpu_buffer))
+		return NULL;
+
+	/* More to read on the reader page */
+	if (cpu_buffer->reader_page->read < rb_page_size(cpu_buffer->reader_page))
+		return cpu_buffer->reader_page;
+
+	prev_reader = cpu_buffer->meta_page->reader.id;
+
+	WARN_ON(cpu_buffer->writer->get_reader_page(cpu_buffer->cpu));
+	/* nr_pages doesn't include the reader page */
+	if (cpu_buffer->meta_page->reader.id > cpu_buffer->nr_pages) {
+		WARN_ON(1);
+		return NULL;
+	}
+
+	cpu_buffer->reader_page->page =
+		(void *)cpu_buffer->subbuf_ids[cpu_buffer->meta_page->reader.id];
+	cpu_buffer->reader_page->read = 0;
+	cpu_buffer->read_stamp = cpu_buffer->reader_page->page->time_stamp;
+	cpu_buffer->lost_events = cpu_buffer->meta_page->reader.lost_events;
+
+	WARN_ON(prev_reader == cpu_buffer->meta_page->reader.id);
+
+	return cpu_buffer->reader_page;
+}
+
+static struct buffer_page *
+__rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
 {
 	struct buffer_page *reader = NULL;
 	unsigned long bsize = READ_ONCE(cpu_buffer->buffer->subbuf_size);
@@ -4624,6 +4745,13 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
 	return reader;
 }
 
+static struct buffer_page *
+rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
+{
+	return cpu_buffer->writer ?
+	       __rb_get_reader_page_from_writer(cpu_buffer) :
+	       __rb_get_reader_page(cpu_buffer);
+}
+
 static void rb_advance_reader(struct ring_buffer_per_cpu *cpu_buffer)
 {
 	struct ring_buffer_event *event;
@@ -5028,7 +5156,7 @@ ring_buffer_read_prepare(struct trace_buffer *buffer, int cpu, gfp_t flags)
 	struct ring_buffer_per_cpu *cpu_buffer;
 	struct ring_buffer_iter *iter;
 
-	if (!cpumask_test_cpu(cpu, buffer->cpumask))
+	if (!cpumask_test_cpu(cpu, buffer->cpumask) || buffer->writer)
 		return NULL;
 
 	iter = kzalloc(sizeof(*iter), flags);
@@ -5198,6 +5326,22 @@ rb_reset_cpu(struct ring_buffer_per_cpu *cpu_buffer)
 {
 	struct buffer_page *page;
 
+	if (cpu_buffer->writer) {
+		if (!cpu_buffer->writer->reset)
+			return;
+
+		cpu_buffer->writer->reset(cpu_buffer->cpu);
+		rb_read_writer_meta_page(cpu_buffer);
+
+		/* Read related values, not covered by the meta-page */
+		local_set(&cpu_buffer->pages_read, 0);
+		cpu_buffer->read = 0;
+		cpu_buffer->read_bytes = 0;
+		cpu_buffer->last_overrun = 0;
+
+		return;
+	}
+
 	rb_head_page_deactivate(cpu_buffer);
 
 	cpu_buffer->head_page
@@ -5428,6 +5572,49 @@ bool ring_buffer_empty_cpu(struct trace_buffer *buffer, int cpu)
 }
 EXPORT_SYMBOL_GPL(ring_buffer_empty_cpu);
 
+int ring_buffer_poll_writer(struct trace_buffer *buffer, int cpu)
+{
+	struct ring_buffer_per_cpu *cpu_buffer;
+	unsigned long flags;
+
+	if (cpu != RING_BUFFER_ALL_CPUS) {
+		if (!cpumask_test_cpu(cpu, buffer->cpumask))
+			return -EINVAL;
+
+		cpu_buffer = buffer->buffers[cpu];
+
+		raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
+		if (rb_read_writer_meta_page(cpu_buffer))
+			rb_wakeups(buffer, cpu_buffer);
+		raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
+
+		return 0;
+	}
+
+	/*
+	 * Make sure all the ring buffers are up to date before we start reading
+	 * them.
+	 */
+	for_each_buffer_cpu(buffer, cpu) {
+		cpu_buffer = buffer->buffers[cpu];
+
+		raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
+		rb_read_writer_meta_page(buffer->buffers[cpu]);
+		raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
+	}
+
+	for_each_buffer_cpu(buffer, cpu) {
+		cpu_buffer = buffer->buffers[cpu];
+
+		raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
+		if (rb_num_of_entries(cpu_buffer))
+			rb_wakeups(buffer, buffer->buffers[cpu]);
+		raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
+	}
+
+	return 0;
+}
+
 #ifdef CONFIG_RING_BUFFER_ALLOW_SWAP
 /**
  * ring_buffer_swap_cpu - swap a CPU buffer between two ring buffers
@@ -5679,6 +5866,7 @@ int ring_buffer_read_page(struct trace_buffer *buffer,
 	unsigned int commit;
 	unsigned int read;
 	u64 save_timestamp;
+	bool force_memcpy;
 	int ret = -1;
 
 	if (!cpumask_test_cpu(cpu, buffer->cpumask))
@@ -5716,6 +5904,8 @@ int ring_buffer_read_page(struct trace_buffer *buffer,
 	/* Check if any events were dropped */
 	missed_events = cpu_buffer->lost_events;
 
+	force_memcpy = cpu_buffer->mapped || cpu_buffer->writer;
+
 	/*
 	 * If this page has been partially read or
 	 * if len is not big enough to read the rest of the page or
@@ -5725,7 +5915,7 @@ int ring_buffer_read_page(struct trace_buffer *buffer,
 	 */
 	if (read || (len < (commit - read)) ||
 	    cpu_buffer->reader_page == cpu_buffer->commit_page ||
-	    cpu_buffer->mapped) {
+	    force_memcpy) {
 		struct buffer_data_page *rpage = cpu_buffer->reader_page->page;
 		unsigned int rpos = read;
 		unsigned int pos = 0;
@@ -6278,7 +6468,7 @@ int ring_buffer_map(struct trace_buffer *buffer, int cpu,
 	unsigned long flags, *subbuf_ids;
 	int err = 0;
 
-	if (!cpumask_test_cpu(cpu, buffer->cpumask))
+	if (!cpumask_test_cpu(cpu, buffer->cpumask) || buffer->writer)
 		return -EINVAL;
 
 	cpu_buffer = buffer->buffers[cpu];
-- 
2.46.0.598.g6f2099f65c-goog

From nobody Sat Nov 30 05:43:21 2024
Date: Wed, 11 Sep 2024 10:30:19 +0100
In-Reply-To: <20240911093029.3279154-1-vdonnefort@google.com>
References: <20240911093029.3279154-1-vdonnefort@google.com>
Message-ID: <20240911093029.3279154-4-vdonnefort@google.com>
Subject: [PATCH 03/13] ring-buffer: Expose buffer_data_page material
From: Vincent Donnefort
To: rostedt@goodmis.org, mhiramat@kernel.org, linux-trace-kernel@vger.kernel.org,
    maz@kernel.org, oliver.upton@linux.dev
Cc: kvmarm@lists.linux.dev, will@kernel.org, qperret@google.com,
    kernel-team@android.com, linux-kernel@vger.kernel.org, Vincent Donnefort

In preparation for allowing ring-buffer compliant pages to be written
outside of ring_buffer.c, move struct buffer_data_page and the timestamp
encoding functions into the publicly available ring_buffer.h.

Signed-off-by: Vincent Donnefort

diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
index d78a33b3c96e..5e4e770f55cb 100644
--- a/include/linux/ring_buffer.h
+++ b/include/linux/ring_buffer.h
@@ -3,8 +3,10 @@
 #define _LINUX_RING_BUFFER_H
 
 #include
-#include
 #include
+#include
+
+#include
 
 #include
 
@@ -20,6 +22,8 @@ struct ring_buffer_event {
 	u32		array[];
 };
 
+#define RB_EVNT_HDR_SIZE (offsetof(struct ring_buffer_event, array))
+
 /**
  * enum ring_buffer_type - internal ring buffer types
  *
@@ -61,11 +65,50 @@ enum ring_buffer_type {
 	RINGBUF_TYPE_TIME_STAMP,
 };
 
+#define TS_SHIFT	27
+#define TS_MASK		((1ULL << TS_SHIFT) - 1)
+#define TS_DELTA_TEST	(~TS_MASK)
+
+/*
+ * We need to fit the time_stamp delta into 27 bits.
+ */
+static inline bool test_time_stamp(u64 delta)
+{
+	return !!(delta & TS_DELTA_TEST);
+}
+
 unsigned ring_buffer_event_length(struct ring_buffer_event *event);
 void *ring_buffer_event_data(struct ring_buffer_event *event);
 u64 ring_buffer_event_time_stamp(struct trace_buffer *buffer,
 				 struct ring_buffer_event *event);
 
+#define BUF_PAGE_HDR_SIZE offsetof(struct buffer_data_page, data)
+
+/* Max payload is BUF_PAGE_SIZE - header (8bytes) */
+#define BUF_MAX_DATA_SIZE (BUF_PAGE_SIZE - (sizeof(u32) * 2))
+
+#define BUF_PAGE_SIZE (PAGE_SIZE - BUF_PAGE_HDR_SIZE)
+
+#define RB_ALIGNMENT		4U
+#define RB_MAX_SMALL_DATA	(RB_ALIGNMENT * RINGBUF_TYPE_DATA_TYPE_LEN_MAX)
+#define RB_EVNT_MIN_SIZE	8U	/* two 32bit words */
+
+#ifndef CONFIG_HAVE_64BIT_ALIGNED_ACCESS
+# define RB_FORCE_8BYTE_ALIGNMENT	0
+# define RB_ARCH_ALIGNMENT		RB_ALIGNMENT
+#else
+# define RB_FORCE_8BYTE_ALIGNMENT	1
+# define RB_ARCH_ALIGNMENT		8U
+#endif
+
+#define RB_ALIGN_DATA		__aligned(RB_ARCH_ALIGNMENT)
+
+struct buffer_data_page {
+	u64		time_stamp;	/* page time stamp */
+	local_t		commit;		/* write committed index */
+	unsigned char	data[] RB_ALIGN_DATA;	/* data of buffer page */
+};
+
 /*
  * ring_buffer_discard_commit will remove an event that has not
  * been committed yet.
 * If this is used, then ring_buffer_unlock_commit
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index b05b7a95e3f1..009ea4477327 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -135,23 +135,6 @@ int ring_buffer_print_entry_header(struct trace_seq *s)
 /* Used for individual buffers (after the counter) */
 #define RB_BUFFER_OFF		(1 << 20)

-#define BUF_PAGE_HDR_SIZE offsetof(struct buffer_data_page, data)
-
-#define RB_EVNT_HDR_SIZE (offsetof(struct ring_buffer_event, array))
-#define RB_ALIGNMENT		4U
-#define RB_MAX_SMALL_DATA	(RB_ALIGNMENT * RINGBUF_TYPE_DATA_TYPE_LEN_MAX)
-#define RB_EVNT_MIN_SIZE	8U	/* two 32bit words */
-
-#ifndef CONFIG_HAVE_64BIT_ALIGNED_ACCESS
-# define RB_FORCE_8BYTE_ALIGNMENT	0
-# define RB_ARCH_ALIGNMENT		RB_ALIGNMENT
-#else
-# define RB_FORCE_8BYTE_ALIGNMENT	1
-# define RB_ARCH_ALIGNMENT		8U
-#endif
-
-#define RB_ALIGN_DATA	__aligned(RB_ARCH_ALIGNMENT)
-
 /* define RINGBUF_TYPE_DATA for 'case RINGBUF_TYPE_DATA:' */
 #define RINGBUF_TYPE_DATA 0 ...
RINGBUF_TYPE_DATA_TYPE_LEN_MAX

@@ -294,10 +277,6 @@ EXPORT_SYMBOL_GPL(ring_buffer_event_data);
 #define for_each_online_buffer_cpu(buffer, cpu)		\
	for_each_cpu_and(cpu, buffer->cpumask, cpu_online_mask)

-#define TS_SHIFT	27
-#define TS_MASK		((1ULL << TS_SHIFT) - 1)
-#define TS_DELTA_TEST	(~TS_MASK)
-
 static u64 rb_event_time_stamp(struct ring_buffer_event *event)
 {
	u64 ts;
@@ -316,12 +295,6 @@ static u64 rb_event_time_stamp(struct ring_buffer_event *event)

 #define RB_MISSED_MASK		(3 << 30)

-struct buffer_data_page {
-	u64		time_stamp;	/* page time stamp */
-	local_t		commit;		/* write committed index */
-	unsigned char	data[] RB_ALIGN_DATA;	/* data of buffer page */
-};
-
 struct buffer_data_read_page {
	unsigned		order;	/* order of the page */
	struct buffer_data_page	*data;	/* actual data, stored in this page */
@@ -377,14 +350,6 @@ static void free_buffer_page(struct buffer_page *bpage)
	kfree(bpage);
 }

-/*
- * We need to fit the time_stamp delta into 27 bits.
- */
-static inline bool test_time_stamp(u64 delta)
-{
-	return !!(delta & TS_DELTA_TEST);
-}
-
 struct rb_irq_work {
	struct irq_work			work;
	wait_queue_head_t		waiters;
--
2.46.0.598.g6f2099f65c-goog
Date: Wed, 11 Sep 2024 10:30:20 +0100
In-Reply-To: <20240911093029.3279154-1-vdonnefort@google.com>
References: <20240911093029.3279154-1-vdonnefort@google.com>
Message-ID: <20240911093029.3279154-5-vdonnefort@google.com>
Subject: [PATCH 04/13] timekeeping: Add the boot clock to system time snapshot
From: Vincent Donnefort
To: rostedt@goodmis.org, mhiramat@kernel.org, linux-trace-kernel@vger.kernel.org, maz@kernel.org, oliver.upton@linux.dev
Cc: kvmarm@lists.linux.dev, will@kernel.org, qperret@google.com, kernel-team@android.com, linux-kernel@vger.kernel.org, Vincent Donnefort, John Stultz, Thomas Gleixner, Stephen Boyd, "Christopher S. Hall", Richard Cochran, Lakshmi Sowjanya D

For tracing purposes, the boot clock is interesting as it doesn't stop on
suspend. Export it as part of the time snapshot. This will later allow the
hypervisor to add boot-clock timestamps to its events.

Cc: John Stultz
Cc: Thomas Gleixner
Cc: Stephen Boyd
Cc: Christopher S. Hall
Cc: Richard Cochran
Cc: Lakshmi Sowjanya D
Acked-by: John Stultz
Signed-off-by: Vincent Donnefort

diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index fc12a9ba2c88..e85c27347e44 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -275,6 +275,7 @@ struct ktime_timestamps {
  *		counter value
  * @cycles:	Clocksource counter value to produce the system times
  * @real:	Realtime system time
+ * @boot:	Boot time
  * @raw:	Monotonic raw system time
  * @cs_id:	Clocksource ID
  * @clock_was_set_seq:	The sequence number of clock-was-set events
@@ -283,6 +284,7 @@ struct ktime_timestamps {
 struct system_time_snapshot {
	u64			cycles;
	ktime_t			real;
+	ktime_t			boot;
	ktime_t			raw;
	enum clocksource_ids	cs_id;
	unsigned int		clock_was_set_seq;
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 5391e4167d60..db16c44dccc3 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1060,6 +1060,7 @@ void ktime_get_snapshot(struct system_time_snapshot *systime_snapshot)
	unsigned int seq;
	ktime_t base_raw;
	ktime_t base_real;
+	ktime_t base_boot;
	u64 nsec_raw;
	u64 nsec_real;
	u64 now;
@@ -1074,6 +1075,8 @@ void ktime_get_snapshot(struct system_time_snapshot *systime_snapshot)
	systime_snapshot->clock_was_set_seq = tk->clock_was_set_seq;
	base_real = ktime_add(tk->tkr_mono.base,
			      tk_core.timekeeper.offs_real);
+	base_boot = ktime_add(tk->tkr_mono.base,
+			      tk_core.timekeeper.offs_boot);
	base_raw = tk->tkr_raw.base;
	nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono, now);
	nsec_raw =
timekeeping_cycles_to_ns(&tk->tkr_raw, now);
@@ -1081,6 +1084,7 @@ void ktime_get_snapshot(struct system_time_snapshot *systime_snapshot)

	systime_snapshot->cycles = now;
	systime_snapshot->real = ktime_add_ns(base_real, nsec_real);
+	systime_snapshot->boot = ktime_add_ns(base_boot, nsec_real);
	systime_snapshot->raw = ktime_add_ns(base_raw, nsec_raw);
 }
 EXPORT_SYMBOL_GPL(ktime_get_snapshot);
--
2.46.0.598.g6f2099f65c-goog
Date: Wed, 11 Sep 2024 10:30:21 +0100
In-Reply-To: <20240911093029.3279154-1-vdonnefort@google.com>
References: <20240911093029.3279154-1-vdonnefort@google.com>
Message-ID: <20240911093029.3279154-6-vdonnefort@google.com>
Subject: [PATCH 05/13] KVM: arm64: Support unaligned fixmap in the nVHE hyp
From: Vincent Donnefort
To: rostedt@goodmis.org, mhiramat@kernel.org, linux-trace-kernel@vger.kernel.org, maz@kernel.org, oliver.upton@linux.dev
Cc: kvmarm@lists.linux.dev, will@kernel.org, qperret@google.com, kernel-team@android.com, linux-kernel@vger.kernel.org, Vincent Donnefort

Return the fixmap VA with the page offset, instead of the page base
address.
Signed-off-by: Vincent Donnefort

diff --git a/arch/arm64/kvm/hyp/nvhe/mm.c b/arch/arm64/kvm/hyp/nvhe/mm.c
index 8850b591d775..c5a9d8874eb2 100644
--- a/arch/arm64/kvm/hyp/nvhe/mm.c
+++ b/arch/arm64/kvm/hyp/nvhe/mm.c
@@ -240,7 +240,7 @@ void *hyp_fixmap_map(phys_addr_t phys)
	WRITE_ONCE(*ptep, pte);
	dsb(ishst);

-	return (void *)slot->addr;
+	return (void *)slot->addr + offset_in_page(phys);
 }

 static void fixmap_clear_slot(struct hyp_fixmap_slot *slot)
--
2.46.0.598.g6f2099f65c-goog
Date: Wed, 11 Sep 2024 10:30:22 +0100
In-Reply-To: <20240911093029.3279154-1-vdonnefort@google.com>
References: <20240911093029.3279154-1-vdonnefort@google.com>
Message-ID: <20240911093029.3279154-7-vdonnefort@google.com>
Subject: [PATCH 06/13] KVM: arm64: Add clock support in the nVHE hyp
From: Vincent Donnefort
To: rostedt@goodmis.org, mhiramat@kernel.org, linux-trace-kernel@vger.kernel.org, maz@kernel.org, oliver.upton@linux.dev
Cc: kvmarm@lists.linux.dev, will@kernel.org, qperret@google.com, kernel-team@android.com, linux-kernel@vger.kernel.org, Vincent Donnefort

By default, the arm64 host kernel uses the arch timer as the source for
sched_clock. Conveniently, EL2 has access to that same counter, allowing
it to generate clock values that are synchronized with the host. The
clock nonetheless needs to be set up with the same slope values as the
kernel's. This also introduces trace_clock(), which is expected to be
configured later by the hypervisor tracing support.
Signed-off-by: Vincent Donnefort

diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index c838309e4ec4..355bae0056f0 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -144,5 +144,4 @@ extern u64 kvm_nvhe_sym(id_aa64smfr0_el1_sys_val);
 extern unsigned long kvm_nvhe_sym(__icache_flags);
 extern unsigned int kvm_nvhe_sym(kvm_arm_vmid_bits);
 extern unsigned int kvm_nvhe_sym(kvm_host_sve_max_vl);
-
 #endif /* __ARM64_KVM_HYP_H__ */
diff --git a/arch/arm64/kvm/hyp/include/nvhe/clock.h b/arch/arm64/kvm/hyp/include/nvhe/clock.h
new file mode 100644
index 000000000000..2bd05b3b89f9
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/clock.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ARM64_KVM_HYP_NVHE_CLOCK_H
+#define __ARM64_KVM_HYP_NVHE_CLOCK_H
+#include
+
+#include
+
+#ifdef CONFIG_TRACING
+void trace_clock_update(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cyc);
+u64 trace_clock(void);
+#else
+static inline void
+trace_clock_update(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cyc) { }
+static inline u64 trace_clock(void) { return 0; }
+#endif
+#endif
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index b43426a493df..323e992089bd 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -28,6 +28,7 @@ hyp-obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o
 hyp-obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
	 ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
 hyp-obj-$(CONFIG_LIST_HARDENED) += list_debug.o
+hyp-obj-$(CONFIG_TRACING) += clock.o
 hyp-obj-y += $(lib-objs)

 ##
diff --git a/arch/arm64/kvm/hyp/nvhe/clock.c b/arch/arm64/kvm/hyp/nvhe/clock.c
new file mode 100644
index 000000000000..0d1f74bc2e11
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/clock.c
@@ -0,0 +1,49 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2024 Google LLC
+ * Author: Vincent Donnefort
+ */
+
+#include
+
+#include
+#include
+
+static struct clock_data {
+	struct {
+		u32 mult;
+		u32 shift;
+		u64 epoch_ns;
+		u64 epoch_cyc;
+	} data[2];
+	u64 cur;
+} trace_clock_data;
+
+/* Does not guarantee no reader on the modified bank. */
+void trace_clock_update(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cyc)
+{
+	struct clock_data *clock = &trace_clock_data;
+	u64 bank = clock->cur ^ 1;
+
+	clock->data[bank].mult		= mult;
+	clock->data[bank].shift		= shift;
+	clock->data[bank].epoch_ns	= epoch_ns;
+	clock->data[bank].epoch_cyc	= epoch_cyc;
+
+	smp_store_release(&clock->cur, bank);
+}
+
+/* Using host provided data. Do not use for anything else than debugging. */
+u64 trace_clock(void)
+{
+	struct clock_data *clock = &trace_clock_data;
+	u64 bank = smp_load_acquire(&clock->cur);
+	u64 cyc, ns;
+
+	cyc = __arch_counter_get_cntpct() - clock->data[bank].epoch_cyc;
+
+	ns = cyc * clock->data[bank].mult;
+	ns >>= clock->data[bank].shift;
+
+	return (u64)ns + clock->data[bank].epoch_ns;
+}
--
2.46.0.598.g6f2099f65c-goog
Date: Wed, 11 Sep 2024 10:30:23 +0100
In-Reply-To: <20240911093029.3279154-1-vdonnefort@google.com>
References: <20240911093029.3279154-1-vdonnefort@google.com>
Message-ID: <20240911093029.3279154-8-vdonnefort@google.com>
Subject: [PATCH 07/13] KVM: arm64: Add tracing support for the pKVM hyp
From: Vincent Donnefort
To: rostedt@goodmis.org, mhiramat@kernel.org, linux-trace-kernel@vger.kernel.org, maz@kernel.org, oliver.upton@linux.dev
Cc: kvmarm@lists.linux.dev, will@kernel.org, qperret@google.com, kernel-team@android.com, linux-kernel@vger.kernel.org, Vincent Donnefort

When running in protected mode, the host has very little knowledge about
what is happening in the hypervisor. That is of course an essential
feature for security, but as that piece of code grows and takes on more
responsibilities, we now need a way to debug and profile it. Tracefs,
with its reliability, versatility and support for user-space, is the
perfect tool.

There's no way the hypervisor could log events directly into the host
tracefs ring-buffers. So instead let's use our own, where the hypervisor
is the writer and the host the reader.

Signed-off-by: Vincent Donnefort

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 2181a11b9d92..d549d7d491c3 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -79,6 +79,10 @@ enum __kvm_host_smccc_func {
	__KVM_HOST_SMCCC_FUNC___pkvm_init_vm,
	__KVM_HOST_SMCCC_FUNC___pkvm_init_vcpu,
	__KVM_HOST_SMCCC_FUNC___pkvm_teardown_vm,
+	__KVM_HOST_SMCCC_FUNC___pkvm_load_tracing,
+	__KVM_HOST_SMCCC_FUNC___pkvm_teardown_tracing,
+	__KVM_HOST_SMCCC_FUNC___pkvm_enable_tracing,
+	__KVM_HOST_SMCCC_FUNC___pkvm_swap_reader_tracing,
 };

 #define DECLARE_KVM_VHE_SYM(sym) extern char sym[]
diff --git a/arch/arm64/include/asm/kvm_hyptrace.h b/arch/arm64/include/asm/kvm_hyptrace.h
new file mode 100644
index 000000000000..7da6a248c7fa
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_hyptrace.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ARM64_KVM_HYPTRACE_H_
+#define __ARM64_KVM_HYPTRACE_H_
+#include
+
+#include
+
+/*
+ * Host donations to the hypervisor to store the struct hyp_buffer_page.
+ */
+struct hyp_buffer_pages_backing {
+	unsigned long start;
+	size_t size;
+};
+
+struct hyp_trace_desc {
+	struct hyp_buffer_pages_backing backing;
+	struct trace_page_desc page_desc;
+
+};
+#endif
diff --git a/arch/arm64/kvm/hyp/include/nvhe/trace.h b/arch/arm64/kvm/hyp/include/nvhe/trace.h
new file mode 100644
index 000000000000..a7c0c73af0e5
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/trace.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ARM64_KVM_HYP_NVHE_TRACE_H
+#define __ARM64_KVM_HYP_NVHE_TRACE_H
+#include
+
+/* Internal struct that needs export for hyp-constants.c */
+struct hyp_buffer_page {
+	struct list_head list;
+	struct buffer_data_page *page;
+	unsigned long write;
+	unsigned long entries;
+	u32 id;
+};
+
+#ifdef CONFIG_TRACING
+void *tracing_reserve_entry(unsigned long length);
+void tracing_commit_entry(void);
+
+int __pkvm_load_tracing(unsigned long desc_va, size_t desc_size);
+void __pkvm_teardown_tracing(void);
+int __pkvm_enable_tracing(bool enable);
+int __pkvm_swap_reader_tracing(unsigned int cpu);
+#else
+static inline void *tracing_reserve_entry(unsigned long length) { return NULL; }
+static inline void tracing_commit_entry(void) { }
+
+static inline int __pkvm_load_tracing(unsigned long desc_va, size_t desc_size) { return -ENODEV; }
+static inline void __pkvm_teardown_tracing(void) { }
+static inline int __pkvm_enable_tracing(bool enable) { return -ENODEV; }
+static inline int __pkvm_swap_reader_tracing(unsigned int cpu) { return -ENODEV; }
+#endif
+#endif
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 323e992089bd..40f243c44cf5 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -28,7 +28,7 @@ hyp-obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o
 hyp-obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
	 ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
 hyp-obj-$(CONFIG_LIST_HARDENED) += list_debug.o
-hyp-obj-$(CONFIG_TRACING) += clock.o
+hyp-obj-$(CONFIG_TRACING) += clock.o trace.o
 hyp-obj-y += $(lib-objs)

 ##
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index f43d845f3c4e..1fb3391e122a 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -17,6 +17,7 @@
 #include
 #include
 #include
+#include
 #include

 DEFINE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);
@@ -373,6 +374,35 @@ static void handle___pkvm_teardown_vm(struct kvm_cpu_context *host_ctxt)
	cpu_reg(host_ctxt, 1) = __pkvm_teardown_vm(handle);
 }

+static void handle___pkvm_load_tracing(struct kvm_cpu_context *host_ctxt)
+{
+	DECLARE_REG(unsigned long, desc_hva, host_ctxt, 1);
+	DECLARE_REG(size_t, desc_size, host_ctxt, 2);
+
+	cpu_reg(host_ctxt, 1) = __pkvm_load_tracing(desc_hva, desc_size);
+}
+
+static void handle___pkvm_teardown_tracing(struct kvm_cpu_context *host_ctxt)
+{
+	__pkvm_teardown_tracing();
+
+	cpu_reg(host_ctxt, 1) = 0;
+}
+
+static void handle___pkvm_enable_tracing(struct kvm_cpu_context *host_ctxt)
+{
+	DECLARE_REG(bool, enable, host_ctxt, 1);
+
+	cpu_reg(host_ctxt, 1) = __pkvm_enable_tracing(enable);
+}
+
+static void handle___pkvm_swap_reader_tracing(struct kvm_cpu_context *host_ctxt)
+{
+	DECLARE_REG(unsigned int, cpu, host_ctxt, 1);
+
+	cpu_reg(host_ctxt, 1) = __pkvm_swap_reader_tracing(cpu);
+}
+
 typedef void (*hcall_t)(struct kvm_cpu_context *);

 #define HANDLE_FUNC(x)	[__KVM_HOST_SMCCC_FUNC_##x] = (hcall_t)handle_##x
@@ -405,6 +435,10 @@ static const hcall_t host_hcall[] = {
	HANDLE_FUNC(__pkvm_init_vm),
	HANDLE_FUNC(__pkvm_init_vcpu),
	HANDLE_FUNC(__pkvm_teardown_vm),
+	HANDLE_FUNC(__pkvm_load_tracing),
+	HANDLE_FUNC(__pkvm_teardown_tracing),
+	HANDLE_FUNC(__pkvm_enable_tracing),
+	HANDLE_FUNC(__pkvm_swap_reader_tracing),
 };

 static void handle_host_hcall(struct kvm_cpu_context *host_ctxt)
diff --git
a/arch/arm64/kvm/hyp/nvhe/trace.c b/arch/arm64/kvm/hyp/nvhe/trace.c
new file mode 100644
index 000000000000..debb3ee7dd3a
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/trace.c
@@ -0,0 +1,589 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2023 Google LLC
+ * Author: Vincent Donnefort
+ */
+
+#include
+#include
+#include
+#include
+
+#include
+#include
+#include
+
+#define HYP_RB_PAGE_HEAD	1UL
+#define HYP_RB_PAGE_UPDATE	2UL
+#define HYP_RB_FLAG_MASK	3UL
+
+struct hyp_rb_per_cpu {
+	struct trace_buffer_meta *meta;
+	struct hyp_buffer_page *tail_page;
+	struct hyp_buffer_page *reader_page;
+	struct hyp_buffer_page *head_page;
+	struct hyp_buffer_page *bpages;
+	unsigned long nr_pages;
+	unsigned long last_overrun;
+	u64 write_stamp;
+	atomic_t status;
+};
+
+#define HYP_RB_UNAVAILABLE	0
+#define HYP_RB_READY		1
+#define HYP_RB_WRITING		2
+
+static struct hyp_buffer_pages_backing hyp_buffer_pages_backing;
+DEFINE_PER_CPU(struct hyp_rb_per_cpu, trace_rb);
+DEFINE_HYP_SPINLOCK(trace_rb_lock);
+
+static bool rb_set_flag(struct hyp_buffer_page *bpage, int new_flag)
+{
+	unsigned long ret, val = (unsigned long)bpage->list.next;
+
+	ret = cmpxchg((unsigned long *)&bpage->list.next,
+		      val, (val & ~HYP_RB_FLAG_MASK) | new_flag);
+
+	return ret == val;
+}
+
+static struct hyp_buffer_page *rb_hyp_buffer_page(struct list_head *list)
+{
+	unsigned long ptr = (unsigned long)list & ~HYP_RB_FLAG_MASK;
+
+	return container_of((struct list_head *)ptr, struct hyp_buffer_page, list);
+}
+
+static struct hyp_buffer_page *rb_next_page(struct hyp_buffer_page *bpage)
+{
+	return rb_hyp_buffer_page(bpage->list.next);
+}
+
+static bool rb_is_head_page(struct hyp_buffer_page *bpage)
+{
+	return (unsigned long)bpage->list.prev->next & HYP_RB_PAGE_HEAD;
+}
+
+static struct hyp_buffer_page *rb_set_head_page(struct hyp_rb_per_cpu *cpu_buffer)
+{
+	struct hyp_buffer_page *bpage, *prev_head;
+	int cnt = 0;
+
+again:
+	bpage = prev_head = cpu_buffer->head_page;
+
+	do {
+		if (rb_is_head_page(bpage)) {
+			cpu_buffer->head_page = bpage;
+			return bpage;
+		}
+
+		bpage = rb_next_page(bpage);
+	} while (bpage != prev_head);
+
+	/* We might have raced with the writer; let's try again */
+	if (++cnt < 3)
+		goto again;
+
+	return NULL;
+}
+
+static int rb_swap_reader_page(struct hyp_rb_per_cpu *cpu_buffer)
+{
+	unsigned long *old_head_link, old_link_val, new_link_val, overrun;
+	struct hyp_buffer_page *head, *reader = cpu_buffer->reader_page;
+
+spin:
+	/* Update the cpu_buffer->header_page according to HYP_RB_PAGE_HEAD */
+	head = rb_set_head_page(cpu_buffer);
+	if (!head)
+		return -ENODEV;
+
+	/* Connect the reader page around the header page */
+	reader->list.next = head->list.next;
+	reader->list.prev = head->list.prev;
+
+	/* The reader page points to the new header page */
+	rb_set_flag(reader, HYP_RB_PAGE_HEAD);
+
+	/*
+	 * Paired with the cmpxchg in rb_move_tail(). Order the read of the head
+	 * page and overrun.
+	 */
+	smp_mb();
+	overrun = READ_ONCE(cpu_buffer->meta->overrun);
+
+	/* Try to swap the prev head link to the reader page */
+	old_head_link = (unsigned long *)&reader->list.prev->next;
+	old_link_val = (*old_head_link & ~HYP_RB_FLAG_MASK) | HYP_RB_PAGE_HEAD;
+	new_link_val = (unsigned long)&reader->list;
+	if (cmpxchg(old_head_link, old_link_val, new_link_val)
+	    != old_link_val)
+		goto spin;
+
+	cpu_buffer->head_page = rb_hyp_buffer_page(reader->list.next);
+	cpu_buffer->head_page->list.prev = &reader->list;
+	cpu_buffer->reader_page = head;
+	cpu_buffer->meta->reader.lost_events = overrun - cpu_buffer->last_overrun;
+	cpu_buffer->meta->reader.id = cpu_buffer->reader_page->id;
+	cpu_buffer->last_overrun = overrun;
+
+	return 0;
+}
+
+static struct hyp_buffer_page *
+rb_move_tail(struct hyp_rb_per_cpu *cpu_buffer)
+{
+	struct hyp_buffer_page *tail_page, *new_tail, *new_head;
+
+	tail_page = cpu_buffer->tail_page;
+	new_tail = rb_next_page(tail_page);
+
+again:
+	/*
+	 * We caught
the reader ... Let's try to move the head page. + * The writer can only rely on ->next links to check if this is head. + */ + if ((unsigned long)tail_page->list.next & HYP_RB_PAGE_HEAD) { + /* The reader moved the head in between */ + if (!rb_set_flag(tail_page, HYP_RB_PAGE_UPDATE)) + goto again; + + WRITE_ONCE(cpu_buffer->meta->overrun, + cpu_buffer->meta->overrun + new_tail->entries); + WRITE_ONCE(meta_pages_lost(cpu_buffer->meta), + meta_pages_lost(cpu_buffer->meta) + 1); + + /* Move the head */ + rb_set_flag(new_tail, HYP_RB_PAGE_HEAD); + + /* The new head is in place, reset the update flag */ + rb_set_flag(tail_page, 0); + + new_head =3D rb_next_page(new_tail); + } + + local_set(&new_tail->page->commit, 0); + + new_tail->write =3D 0; + new_tail->entries =3D 0; + + WRITE_ONCE(meta_pages_touched(cpu_buffer->meta), + meta_pages_touched(cpu_buffer->meta) + 1); + cpu_buffer->tail_page =3D new_tail; + + return new_tail; +} + +static unsigned long rb_event_size(unsigned long length) +{ + struct ring_buffer_event *event; + + return length + RB_EVNT_HDR_SIZE + sizeof(event->array[0]); +} + +static struct ring_buffer_event * +rb_add_ts_extend(struct ring_buffer_event *event, u64 delta) +{ + event->type_len =3D RINGBUF_TYPE_TIME_EXTEND; + event->time_delta =3D delta & TS_MASK; + event->array[0] =3D delta >> TS_SHIFT; + + return (struct ring_buffer_event *)((unsigned long)event + 8); +} + +static struct ring_buffer_event * +rb_reserve_next(struct hyp_rb_per_cpu *cpu_buffer, unsigned long length) +{ + unsigned long ts_ext_size =3D 0, event_size =3D rb_event_size(length); + struct hyp_buffer_page *tail_page =3D cpu_buffer->tail_page; + struct ring_buffer_event *event; + unsigned long write, prev_write; + u64 ts, time_delta; + + ts =3D trace_clock(); + + time_delta =3D ts - cpu_buffer->write_stamp; + + if (test_time_stamp(time_delta)) + ts_ext_size =3D 8; + + prev_write =3D tail_page->write; + write =3D prev_write + event_size + ts_ext_size; + + if (unlikely(write > 
BUF_PAGE_SIZE)) + tail_page =3D rb_move_tail(cpu_buffer); + + if (!tail_page->entries) { + tail_page->page->time_stamp =3D ts; + time_delta =3D 0; + ts_ext_size =3D 0; + write =3D event_size; + prev_write =3D 0; + } + + tail_page->write =3D write; + tail_page->entries++; + + cpu_buffer->write_stamp =3D ts; + + event =3D (struct ring_buffer_event *)(tail_page->page->data + + prev_write); + if (ts_ext_size) { + event =3D rb_add_ts_extend(event, time_delta); + time_delta =3D 0; + } + + event->type_len =3D 0; + event->time_delta =3D time_delta; + event->array[0] =3D event_size - RB_EVNT_HDR_SIZE; + + return event; +} + +void *tracing_reserve_entry(unsigned long length) +{ + struct hyp_rb_per_cpu *cpu_buffer =3D this_cpu_ptr(&trace_rb); + struct ring_buffer_event *rb_event; + + if (atomic_cmpxchg(&cpu_buffer->status, HYP_RB_READY, HYP_RB_WRITING) + =3D=3D HYP_RB_UNAVAILABLE) + return NULL; + + rb_event =3D rb_reserve_next(cpu_buffer, length); + + return &rb_event->array[1]; +} + +void tracing_commit_entry(void) +{ + struct hyp_rb_per_cpu *cpu_buffer =3D this_cpu_ptr(&trace_rb); + + local_set(&cpu_buffer->tail_page->page->commit, + cpu_buffer->tail_page->write); + WRITE_ONCE(cpu_buffer->meta->entries, + cpu_buffer->meta->entries + 1); + + /* Paired with rb_cpu_disable_writing() */ + atomic_set_release(&cpu_buffer->status, HYP_RB_READY); +} + +static int rb_page_init(struct hyp_buffer_page *bpage, unsigned long hva) +{ + void *hyp_va =3D (void *)kern_hyp_va(hva); + int ret; + + ret =3D hyp_pin_shared_mem(hyp_va, hyp_va + PAGE_SIZE); + if (ret) + return ret; + + INIT_LIST_HEAD(&bpage->list); + bpage->page =3D (struct buffer_data_page *)hyp_va; + + local_set(&bpage->page->commit, 0); + + return 0; +} + +static bool rb_cpu_loaded(struct hyp_rb_per_cpu *cpu_buffer) +{ + return !!cpu_buffer->bpages; +} + +static void rb_cpu_disable_writing(struct hyp_rb_per_cpu *cpu_buffer) +{ + int prev_status; + + /* Wait for the buffer to be released */ + do { + prev_status =3D 
atomic_cmpxchg_acquire(&cpu_buffer->status, + HYP_RB_READY, + HYP_RB_UNAVAILABLE); + } while (prev_status =3D=3D HYP_RB_WRITING); +} + +static int rb_cpu_enable_writing(struct hyp_rb_per_cpu *cpu_buffer) +{ + if (!rb_cpu_loaded(cpu_buffer)) + return -ENODEV; + + atomic_cmpxchg(&cpu_buffer->status, HYP_RB_UNAVAILABLE, HYP_RB_READY); + + return 0; +} + +static void rb_cpu_teardown(struct hyp_rb_per_cpu *cpu_buffer) +{ + int i; + + if (!rb_cpu_loaded(cpu_buffer)) + return; + + rb_cpu_disable_writing(cpu_buffer); + + hyp_unpin_shared_mem((void *)cpu_buffer->meta, + (void *)(cpu_buffer->meta) + PAGE_SIZE); + + for (i =3D 0; i < cpu_buffer->nr_pages; i++) { + struct hyp_buffer_page *bpage =3D &cpu_buffer->bpages[i]; + + if (!bpage->page) + continue; + + hyp_unpin_shared_mem((void *)bpage->page, + (void *)bpage->page + PAGE_SIZE); + } + + cpu_buffer->bpages =3D 0; +} + +static bool rb_cpu_fits_backing(unsigned long nr_pages, + struct hyp_buffer_page *start) +{ + unsigned long max =3D hyp_buffer_pages_backing.start + + hyp_buffer_pages_backing.size; + struct hyp_buffer_page *end =3D start + nr_pages; + + return (unsigned long)end <=3D max; +} + +static bool rb_cpu_fits_desc(struct rb_page_desc *pdesc, + unsigned long desc_end) +{ + unsigned long *end; + + /* Check we can at least read nr_pages */ + if ((unsigned long)&pdesc->nr_page_va >=3D desc_end) + return false; + + end =3D &pdesc->page_va[pdesc->nr_page_va]; + + return (unsigned long)end <=3D desc_end; +} + +static int rb_cpu_init(struct rb_page_desc *pdesc, struct hyp_buffer_page = *start, + struct hyp_rb_per_cpu *cpu_buffer) +{ + struct hyp_buffer_page *bpage =3D start; + int i, ret; + + /* At least 1 reader page and one head */ + if (pdesc->nr_page_va < 2) + return -EINVAL; + + if (!rb_cpu_fits_backing(pdesc->nr_page_va, start)) + return -EINVAL; + + if (rb_cpu_loaded(cpu_buffer)) + return -EBUSY; + + cpu_buffer->bpages =3D start; + + cpu_buffer->meta =3D (struct trace_buffer_meta *)kern_hyp_va(pdesc->meta_= va); + 
ret =3D hyp_pin_shared_mem((void *)cpu_buffer->meta, + ((void *)cpu_buffer->meta) + PAGE_SIZE); + if (ret) + return ret; + + memset(cpu_buffer->meta, 0, sizeof(*cpu_buffer->meta)); + cpu_buffer->meta->meta_page_size =3D PAGE_SIZE; + cpu_buffer->meta->nr_subbufs =3D cpu_buffer->nr_pages; + + /* The reader page is not part of the ring initially */ + ret =3D rb_page_init(bpage, pdesc->page_va[0]); + if (ret) + goto err; + + cpu_buffer->nr_pages =3D 1; + + cpu_buffer->reader_page =3D bpage; + cpu_buffer->tail_page =3D bpage + 1; + cpu_buffer->head_page =3D bpage + 1; + + for (i =3D 1; i < pdesc->nr_page_va; i++) { + ret =3D rb_page_init(++bpage, pdesc->page_va[i]); + if (ret) + goto err; + + bpage->list.next =3D &(bpage + 1)->list; + bpage->list.prev =3D &(bpage - 1)->list; + bpage->id =3D i; + + cpu_buffer->nr_pages =3D i + 1; + } + + /* Close the ring */ + bpage->list.next =3D &cpu_buffer->tail_page->list; + cpu_buffer->tail_page->list.prev =3D &bpage->list; + + /* The last init'ed page points to the head page */ + rb_set_flag(bpage, HYP_RB_PAGE_HEAD); + + cpu_buffer->last_overrun =3D 0; + + return 0; + +err: + rb_cpu_teardown(cpu_buffer); + + return ret; +} + +static int rb_setup_bpage_backing(struct hyp_trace_desc *desc) +{ + unsigned long start =3D kern_hyp_va(desc->backing.start); + size_t size =3D desc->backing.size; + int ret; + + if (hyp_buffer_pages_backing.size) + return -EBUSY; + + if (!PAGE_ALIGNED(start) || !PAGE_ALIGNED(size)) + return -EINVAL; + + ret =3D __pkvm_host_donate_hyp(hyp_virt_to_pfn((void *)start), size >> PA= GE_SHIFT); + if (ret) + return ret; + + memset((void *)start, 0, size); + + hyp_buffer_pages_backing.start =3D start; + hyp_buffer_pages_backing.size =3D size; + + return 0; +} + +static void rb_teardown_bpage_backing(void) +{ + unsigned long start =3D hyp_buffer_pages_backing.start; + size_t size =3D hyp_buffer_pages_backing.size; + + if (!size) + return; + + memset((void *)start, 0, size); + + 
WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(start), size >> PAGE_SHIFT= )); + + hyp_buffer_pages_backing.start =3D 0; + hyp_buffer_pages_backing.size =3D 0; +} + +int __pkvm_swap_reader_tracing(unsigned int cpu) +{ + struct hyp_rb_per_cpu *cpu_buffer; + int ret =3D 0; + + if (cpu >=3D hyp_nr_cpus) + return -EINVAL; + + hyp_spin_lock(&trace_rb_lock); + + cpu_buffer =3D per_cpu_ptr(&trace_rb, cpu); + if (!rb_cpu_loaded(cpu_buffer)) + ret =3D -ENODEV; + else + ret =3D rb_swap_reader_page(cpu_buffer); + + hyp_spin_unlock(&trace_rb_lock); + + return ret; +} + +static void __pkvm_teardown_tracing_locked(void) +{ + int cpu; + + hyp_assert_lock_held(&trace_rb_lock); + + for (cpu =3D 0; cpu < hyp_nr_cpus; cpu++) { + struct hyp_rb_per_cpu *cpu_buffer =3D per_cpu_ptr(&trace_rb, cpu); + + rb_cpu_teardown(cpu_buffer); + } + + rb_teardown_bpage_backing(); +} + +void __pkvm_teardown_tracing(void) +{ + hyp_spin_lock(&trace_rb_lock); + __pkvm_teardown_tracing_locked(); + hyp_spin_unlock(&trace_rb_lock); +} + +int __pkvm_load_tracing(unsigned long desc_hva, size_t desc_size) +{ + struct hyp_trace_desc *desc =3D (struct hyp_trace_desc *)kern_hyp_va(desc= _hva); + struct trace_page_desc *trace_pdesc =3D &desc->page_desc; + struct hyp_buffer_page *bpage_backing_start; + struct rb_page_desc *pdesc; + int ret, cpu; + + if (!desc_size || !PAGE_ALIGNED(desc_hva) || !PAGE_ALIGNED(desc_size)) + return -EINVAL; + + ret =3D __pkvm_host_donate_hyp(hyp_virt_to_pfn((void *)desc), + desc_size >> PAGE_SHIFT); + if (ret) + return ret; + + hyp_spin_lock(&trace_rb_lock); + + ret =3D rb_setup_bpage_backing(desc); + if (ret) + goto err; + + bpage_backing_start =3D (struct hyp_buffer_page *)hyp_buffer_pages_backin= g.start; + + for_each_rb_page_desc(pdesc, cpu, trace_pdesc) { + struct hyp_rb_per_cpu *cpu_buffer; + int cpu; + + ret =3D -EINVAL; + if (!rb_cpu_fits_desc(pdesc, desc_hva + desc_size)) + break; + + cpu =3D pdesc->cpu; + if (cpu >=3D hyp_nr_cpus) + break; + + cpu_buffer =3D 
per_cpu_ptr(&trace_rb, cpu);
+
+		ret = rb_cpu_init(pdesc, bpage_backing_start, cpu_buffer);
+		if (ret)
+			break;
+
+		bpage_backing_start += pdesc->nr_page_va;
+	}
+
+err:
+	if (ret)
+		__pkvm_teardown_tracing_locked();
+
+	hyp_spin_unlock(&trace_rb_lock);
+
+	WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn((void *)desc),
+				       desc_size >> PAGE_SHIFT));
+	return ret;
+}
+
+int __pkvm_enable_tracing(bool enable)
+{
+	int cpu, ret = enable ? -EINVAL : 0;
+
+	hyp_spin_lock(&trace_rb_lock);
+	for (cpu = 0; cpu < hyp_nr_cpus; cpu++) {
+		struct hyp_rb_per_cpu *cpu_buffer = per_cpu_ptr(&trace_rb, cpu);
+
+		if (enable) {
+			if (!rb_cpu_enable_writing(cpu_buffer))
+				ret = 0;
+		} else {
+			rb_cpu_disable_writing(cpu_buffer);
+		}
+
+	}
+	hyp_spin_unlock(&trace_rb_lock);
+
+	return ret;
+}
-- 
2.46.0.598.g6f2099f65c-goog

From nobody Sat Nov 30 05:43:21 2024
Date: Wed, 11 Sep 2024 10:30:24 +0100
In-Reply-To: <20240911093029.3279154-1-vdonnefort@google.com>
References: <20240911093029.3279154-1-vdonnefort@google.com>
Message-ID: <20240911093029.3279154-9-vdonnefort@google.com>
Subject: [PATCH 08/13] KVM: arm64: Add hyp tracing to tracefs
From: Vincent Donnefort
To: rostedt@goodmis.org, mhiramat@kernel.org, linux-trace-kernel@vger.kernel.org, maz@kernel.org, oliver.upton@linux.dev
Cc: kvmarm@lists.linux.dev, will@kernel.org, qperret@google.com, kernel-team@android.com, linux-kernel@vger.kernel.org, Vincent Donnefort

When running with KVM protected mode, the hypervisor is able to generate
events into tracefs-compatible ring buffers. Plug those ring buffers into
tracefs. The interface is found in hyp/ and contains the same hierarchy as
any host instance, easing support by existing user-space tools.

This doesn't yet provide any event support; that will come later.
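
The hyp ring buffer in the previous patch tracks the head page by tagging the low bits of the circular list's `->next` pointers (`HYP_RB_PAGE_HEAD`, `HYP_RB_PAGE_UPDATE`), which are always zero for aligned structures. A minimal user-space sketch of that pointer-tagging technique is below; the names and the simplified single-link ring are illustrative, not the kernel implementation:

```c
#include <assert.h>
#include <stdint.h>

#define RB_PAGE_HEAD	1UL	/* page pointed to is the head */
#define RB_PAGE_UPDATE	2UL	/* head move in progress */
#define RB_FLAG_MASK	3UL

struct bpage { struct bpage *next; };

/* Strip the flag bits to recover the real pointer */
static struct bpage *rb_next(struct bpage *p)
{
	return (struct bpage *)((uintptr_t)p->next & ~RB_FLAG_MASK);
}

/* Replace the flag bits on p->next while keeping its target */
static void rb_set_flag(struct bpage *p, unsigned long flag)
{
	uintptr_t v = (uintptr_t)p->next;

	p->next = (struct bpage *)((v & ~RB_FLAG_MASK) | flag);
}

/* A page is the head iff its predecessor's ->next carries RB_PAGE_HEAD */
static int rb_points_to_head(struct bpage *prev)
{
	return ((uintptr_t)prev->next & RB_PAGE_HEAD) != 0;
}
```

Because the flag lives in the link preceding the head, reader and writer can move the head with a single `cmpxchg` on that one pointer word, which is what `rb_swap_reader_page()` and `rb_move_tail()` race over.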
Signed-off-by: Vincent Donnefort diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile index 86a629aaf0a1..c5bbf6b087a0 100644 --- a/arch/arm64/kvm/Makefile +++ b/arch/arm64/kvm/Makefile @@ -28,6 +28,8 @@ kvm-y +=3D arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.= o \ kvm-$(CONFIG_HW_PERF_EVENTS) +=3D pmu-emul.o pmu.o kvm-$(CONFIG_ARM64_PTR_AUTH) +=3D pauth.o =20 +kvm-$(CONFIG_TRACING) +=3D hyp_trace.o + always-y :=3D hyp_constants.h hyp-constants.s =20 define rule_gen_hyp_constants diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 9bef7638342e..444719b44f7a 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -25,6 +25,7 @@ =20 #define CREATE_TRACE_POINTS #include "trace_arm.h" +#include "hyp_trace.h" =20 #include #include @@ -2330,6 +2331,9 @@ static int __init init_subsystems(void) =20 kvm_register_perf_callbacks(NULL); =20 + err =3D hyp_trace_init_tracefs(); + if (err) + kvm_err("Failed to initialize Hyp tracing\n"); out: if (err) hyp_cpu_pm_exit(); diff --git a/arch/arm64/kvm/hyp/hyp-constants.c b/arch/arm64/kvm/hyp/hyp-co= nstants.c index b257a3b4bfc5..5c4a797a701f 100644 --- a/arch/arm64/kvm/hyp/hyp-constants.c +++ b/arch/arm64/kvm/hyp/hyp-constants.c @@ -3,11 +3,15 @@ #include #include #include +#include =20 int main(void) { DEFINE(STRUCT_HYP_PAGE_SIZE, sizeof(struct hyp_page)); DEFINE(PKVM_HYP_VM_SIZE, sizeof(struct pkvm_hyp_vm)); DEFINE(PKVM_HYP_VCPU_SIZE, sizeof(struct pkvm_hyp_vcpu)); +#ifdef CONFIG_TRACING + DEFINE(STRUCT_HYP_BUFFER_PAGE_SIZE, sizeof(struct hyp_buffer_page)); +#endif return 0; } diff --git a/arch/arm64/kvm/hyp_trace.c b/arch/arm64/kvm/hyp_trace.c new file mode 100644 index 000000000000..b9d1f96d0678 --- /dev/null +++ b/arch/arm64/kvm/hyp_trace.c @@ -0,0 +1,664 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2024 Google LLC + * Author: Vincent Donnefort + */ + +#include +#include +#include + +#include +#include + +#include "hyp_constants.h" +#include "hyp_trace.h" + +#define RB_POLL_MS 
100 + +#define TRACEFS_DIR "hypervisor" +#define TRACEFS_MODE_WRITE 0640 +#define TRACEFS_MODE_READ 0440 + +static struct hyp_trace_buffer { + struct hyp_trace_desc *desc; + struct ring_buffer_writer writer; + struct trace_buffer *trace_buffer; + size_t desc_size; + bool tracing_on; + int nr_readers; + struct mutex lock; +} hyp_trace_buffer =3D { + .lock =3D __MUTEX_INITIALIZER(hyp_trace_buffer.lock), +}; + +static size_t hyp_trace_buffer_size =3D 7 << 10; + +/* Number of pages the ring-buffer requires to accommodate for size */ +#define NR_PAGES(size) \ + ((PAGE_ALIGN(size) >> PAGE_SHIFT) + 1) + +static inline bool hyp_trace_buffer_loaded(struct hyp_trace_buffer *hyp_bu= ffer) +{ + return !!hyp_buffer->trace_buffer; +} + +static inline bool hyp_trace_buffer_used(struct hyp_trace_buffer *hyp_buff= er) +{ + return hyp_buffer->nr_readers || hyp_buffer->tracing_on || + !ring_buffer_empty(hyp_buffer->trace_buffer); +} + +static int +bpage_backing_alloc(struct hyp_buffer_pages_backing *bpage_backing, size_t= size) +{ + size_t backing_size; + void *start; + + backing_size =3D PAGE_ALIGN(STRUCT_HYP_BUFFER_PAGE_SIZE * NR_PAGES(size) * + num_possible_cpus()); + + start =3D alloc_pages_exact(backing_size, GFP_KERNEL_ACCOUNT); + if (!start) + return -ENOMEM; + + bpage_backing->start =3D (unsigned long)start; + bpage_backing->size =3D backing_size; + + return 0; +} + +static void +bpage_backing_free(struct hyp_buffer_pages_backing *bpage_backing) +{ + free_pages_exact((void *)bpage_backing->start, bpage_backing->size); +} + +static int __get_reader_page(int cpu) +{ + return kvm_call_hyp_nvhe(__pkvm_swap_reader_tracing, cpu); +} + +static void hyp_trace_free_pages(struct hyp_trace_desc *desc) +{ + struct rb_page_desc *rb_desc; + int cpu, id; + + for_each_rb_page_desc(rb_desc, cpu, &desc->page_desc) { + free_page(rb_desc->meta_va); + for (id =3D 0; id < rb_desc->nr_page_va; id++) + free_page(rb_desc->page_va[id]); + } +} + +static int hyp_trace_alloc_pages(struct hyp_trace_desc 
*desc, size_t size) +{ + int err =3D 0, cpu, id, nr_pages =3D NR_PAGES(size); + struct trace_page_desc *trace_desc; + struct rb_page_desc *rb_desc; + + trace_desc =3D &desc->page_desc; + trace_desc->nr_cpus =3D 0; + + rb_desc =3D (struct rb_page_desc *)&trace_desc->__data[0]; + + for_each_possible_cpu(cpu) { + rb_desc->cpu =3D cpu; + rb_desc->nr_page_va =3D 0; + rb_desc->meta_va =3D (unsigned long)page_to_virt(alloc_page(GFP_KERNEL)); + if (!rb_desc->meta_va) { + err =3D -ENOMEM; + break; + } + for (id =3D 0; id < nr_pages; id++) { + rb_desc->page_va[id] =3D (unsigned long)page_to_virt(alloc_page(GFP_KER= NEL)); + if (!rb_desc->page_va[id]) { + err =3D -ENOMEM; + break; + } + rb_desc->nr_page_va++; + } + trace_desc->nr_cpus++; + rb_desc =3D __next_rb_page_desc(rb_desc); + } + + if (err) { + hyp_trace_free_pages(desc); + return err; + } + + return 0; +} + +static int __load_page(unsigned long va) +{ + return kvm_call_hyp_nvhe(__pkvm_host_share_hyp, virt_to_pfn((void *)va), = 1); +} + +static void __teardown_page(unsigned long va) +{ + WARN_ON(kvm_call_hyp_nvhe(__pkvm_host_unshare_hyp, virt_to_pfn((void *)va= ), 1)); +} + +static void hyp_trace_teardown_pages(struct hyp_trace_desc *desc, + int last_cpu) +{ + struct rb_page_desc *rb_desc; + int cpu, id; + + for_each_rb_page_desc(rb_desc, cpu, &desc->page_desc) { + if (cpu > last_cpu) + break; + __teardown_page(rb_desc->meta_va); + for (id =3D 0; id < rb_desc->nr_page_va; id++) + __teardown_page(rb_desc->page_va[id]); + } +} + +static int hyp_trace_load_pages(struct hyp_trace_desc *desc) +{ + int last_loaded_cpu =3D 0, cpu, id, err =3D -EINVAL; + struct rb_page_desc *rb_desc; + + for_each_rb_page_desc(rb_desc, cpu, &desc->page_desc) { + err =3D __load_page(rb_desc->meta_va); + if (err) + break; + + for (id =3D 0; id < rb_desc->nr_page_va; id++) { + err =3D __load_page(rb_desc->page_va[id]); + if (err) + break; + } + + if (!err) + continue; + + for (id--; id >=3D 0; id--) + __teardown_page(rb_desc->page_va[id]); + + 
last_loaded_cpu =3D cpu - 1; + + break; + } + + if (!err) + return 0; + + hyp_trace_teardown_pages(desc, last_loaded_cpu); + + return err; +} + +static int hyp_trace_buffer_load(struct hyp_trace_buffer *hyp_buffer, size= _t size) +{ + int ret, nr_pages =3D NR_PAGES(size); + struct rb_page_desc *rbdesc; + struct hyp_trace_desc *desc; + size_t desc_size; + + if (hyp_trace_buffer_loaded(hyp_buffer)) + return 0; + + desc_size =3D size_add(offsetof(struct hyp_trace_desc, page_desc), + offsetof(struct trace_page_desc, __data)); + desc_size =3D size_add(desc_size, + size_mul(num_possible_cpus(), + struct_size(rbdesc, page_va, nr_pages))); + if (desc_size =3D=3D SIZE_MAX) + return -E2BIG; + + /* + * The hypervisor will unmap the descriptor from the host to protect the + * reading. Page granularity for the allocation ensures no other + * useful data will be unmapped. + */ + desc_size =3D PAGE_ALIGN(desc_size); + + desc =3D (struct hyp_trace_desc *)alloc_pages_exact(desc_size, GFP_KERNEL= ); + if (!desc) + return -ENOMEM; + + ret =3D hyp_trace_alloc_pages(desc, size); + if (ret) + goto err_free_desc; + + ret =3D bpage_backing_alloc(&desc->backing, size); + if (ret) + goto err_free_pages; + + ret =3D hyp_trace_load_pages(desc); + if (ret) + goto err_free_backing; + + ret =3D kvm_call_hyp_nvhe(__pkvm_load_tracing, (unsigned long)desc, desc_= size); + if (ret) + goto err_teardown_pages; + + hyp_buffer->writer.pdesc =3D &desc->page_desc; + hyp_buffer->writer.get_reader_page =3D __get_reader_page; + hyp_buffer->trace_buffer =3D ring_buffer_reader(&hyp_buffer->writer); + if (!hyp_buffer->trace_buffer) { + ret =3D -ENOMEM; + goto err_teardown_tracing; + } + + hyp_buffer->desc =3D desc; + hyp_buffer->desc_size =3D desc_size; + + return 0; + +err_teardown_tracing: + kvm_call_hyp_nvhe(__pkvm_teardown_tracing); +err_teardown_pages: + hyp_trace_teardown_pages(desc, INT_MAX); +err_free_backing: + bpage_backing_free(&desc->backing); +err_free_pages: + hyp_trace_free_pages(desc); 
+err_free_desc: + free_pages_exact(desc, desc_size); + + return ret; +} + +static void hyp_trace_buffer_teardown(struct hyp_trace_buffer *hyp_buffer) +{ + struct hyp_trace_desc *desc =3D hyp_buffer->desc; + size_t desc_size =3D hyp_buffer->desc_size; + + if (!hyp_trace_buffer_loaded(hyp_buffer)) + return; + + if (hyp_trace_buffer_used(hyp_buffer)) + return; + + if (kvm_call_hyp_nvhe(__pkvm_teardown_tracing)) + return; + + ring_buffer_free(hyp_buffer->trace_buffer); + hyp_trace_teardown_pages(desc, INT_MAX); + bpage_backing_free(&desc->backing); + hyp_trace_free_pages(desc); + free_pages_exact(desc, desc_size); + hyp_buffer->trace_buffer =3D NULL; +} + +static int hyp_trace_start(void) +{ + struct hyp_trace_buffer *hyp_buffer =3D &hyp_trace_buffer; + int ret =3D 0; + + mutex_lock(&hyp_buffer->lock); + + if (hyp_buffer->tracing_on) + goto out; + + ret =3D hyp_trace_buffer_load(hyp_buffer, hyp_trace_buffer_size); + if (ret) + goto out; + + ret =3D kvm_call_hyp_nvhe(__pkvm_enable_tracing, true); + if (ret) { + hyp_trace_buffer_teardown(hyp_buffer); + goto out; + } + + hyp_buffer->tracing_on =3D true; + +out: + mutex_unlock(&hyp_buffer->lock); + + return ret; +} + +static void hyp_trace_stop(void) +{ + struct hyp_trace_buffer *hyp_buffer =3D &hyp_trace_buffer; + int ret; + + mutex_lock(&hyp_buffer->lock); + + if (!hyp_buffer->tracing_on) + goto end; + + ret =3D kvm_call_hyp_nvhe(__pkvm_enable_tracing, false); + if (!ret) { + ring_buffer_poll_writer(hyp_buffer->trace_buffer, + RING_BUFFER_ALL_CPUS); + hyp_buffer->tracing_on =3D false; + hyp_trace_buffer_teardown(hyp_buffer); + } + +end: + mutex_unlock(&hyp_buffer->lock); +} + +static ssize_t hyp_tracing_on(struct file *filp, const char __user *ubuf, + size_t cnt, loff_t *ppos) +{ + unsigned long val; + int ret; + + ret =3D kstrtoul_from_user(ubuf, cnt, 10, &val); + if (ret) + return ret; + + if (val) + ret =3D hyp_trace_start(); + else + hyp_trace_stop(); + + return ret ? 
ret : cnt; +} + +static ssize_t hyp_tracing_on_read(struct file *filp, char __user *ubuf, + size_t cnt, loff_t *ppos) +{ + char buf[3]; + int r; + + mutex_lock(&hyp_trace_buffer.lock); + r =3D sprintf(buf, "%d\n", hyp_trace_buffer.tracing_on); + mutex_unlock(&hyp_trace_buffer.lock); + + return simple_read_from_buffer(ubuf, cnt, ppos, buf, r); +} + +static const struct file_operations hyp_tracing_on_fops =3D { + .write =3D hyp_tracing_on, + .read =3D hyp_tracing_on_read, +}; + +static ssize_t hyp_buffer_size(struct file *filp, const char __user *ubuf, + size_t cnt, loff_t *ppos) +{ + unsigned long val; + int ret; + + ret =3D kstrtoul_from_user(ubuf, cnt, 10, &val); + if (ret) + return ret; + + if (!val) + return -EINVAL; + + mutex_lock(&hyp_trace_buffer.lock); + hyp_trace_buffer_size =3D val << 10; /* KB to B */ + mutex_unlock(&hyp_trace_buffer.lock); + + return cnt; +} + +static ssize_t hyp_buffer_size_read(struct file *filp, char __user *ubuf, + size_t cnt, loff_t *ppos) +{ + char buf[64]; + int r; + + mutex_lock(&hyp_trace_buffer.lock); + r =3D sprintf(buf, "%lu (%s)\n", hyp_trace_buffer_size >> 10, + hyp_trace_buffer_loaded(&hyp_trace_buffer) ? 
+ "loaded" : "unloaded"); + mutex_unlock(&hyp_trace_buffer.lock); + + return simple_read_from_buffer(ubuf, cnt, ppos, buf, r); +} + +static const struct file_operations hyp_buffer_size_fops =3D { + .write =3D hyp_buffer_size, + .read =3D hyp_buffer_size_read, +}; + +static void ht_print_trace_time(struct ht_iterator *iter) +{ + unsigned long usecs_rem; + u64 ts_ns =3D iter->ts; + + do_div(ts_ns, 1000); + usecs_rem =3D do_div(ts_ns, USEC_PER_SEC); + + trace_seq_printf(&iter->seq, "%5lu.%06lu: ", + (unsigned long)ts_ns, usecs_rem); +} + +static void ht_print_trace_cpu(struct ht_iterator *iter) +{ + trace_seq_printf(&iter->seq, "[%03d]\t", iter->ent_cpu); +} + +static int ht_print_trace_fmt(struct ht_iterator *iter) +{ + if (iter->lost_events) + trace_seq_printf(&iter->seq, "CPU:%d [LOST %lu EVENTS]\n", + iter->ent_cpu, iter->lost_events); + + ht_print_trace_cpu(iter); + ht_print_trace_time(iter); + + return trace_seq_has_overflowed(&iter->seq) ? -EOVERFLOW : 0; +}; + +static struct ring_buffer_event *__ht_next_pipe_event(struct ht_iterator *= iter) +{ + struct ring_buffer_event *evt =3D NULL; + int cpu =3D iter->cpu; + + if (cpu !=3D RING_BUFFER_ALL_CPUS) { + if (ring_buffer_empty_cpu(iter->trace_buffer, cpu)) + return NULL; + + iter->ent_cpu =3D cpu; + + return ring_buffer_peek(iter->trace_buffer, cpu, &iter->ts, + &iter->lost_events); + } + + iter->ts =3D LLONG_MAX; + for_each_possible_cpu(cpu) { + struct ring_buffer_event *_evt; + unsigned long lost_events; + u64 ts; + + if (ring_buffer_empty_cpu(iter->trace_buffer, cpu)) + continue; + + _evt =3D ring_buffer_peek(iter->trace_buffer, cpu, &ts, + &lost_events); + if (!_evt) + continue; + + if (ts >=3D iter->ts) + continue; + + iter->ts =3D ts; + iter->ent_cpu =3D cpu; + iter->lost_events =3D lost_events; + evt =3D _evt; + } + + return evt; +} + +static void *ht_next_pipe_event(struct ht_iterator *iter) +{ + struct ring_buffer_event *event; + + event =3D __ht_next_pipe_event(iter); + if (!event) + return NULL; + + 
iter->ent = (struct hyp_entry_hdr *)&event->array[1]; + iter->ent_size = event->array[0]; + + return iter; +} + +static ssize_t +hyp_trace_pipe_read(struct file *file, char __user *ubuf, + size_t cnt, loff_t *ppos) +{ + struct ht_iterator *iter = (struct ht_iterator *)file->private_data; + int ret; + +copy_to_user: + ret = trace_seq_to_user(&iter->seq, ubuf, cnt); + if (ret != -EBUSY) + return ret; + + trace_seq_init(&iter->seq); + + ret = ring_buffer_wait(iter->trace_buffer, iter->cpu, 0, NULL, NULL); + if (ret < 0) + return ret; + + while (ht_next_pipe_event(iter)) { + int prev_len = iter->seq.seq.len; + + if (ht_print_trace_fmt(iter)) { + iter->seq.seq.len = prev_len; + break; + } + + ring_buffer_consume(iter->trace_buffer, iter->ent_cpu, NULL, + NULL); + } + + goto copy_to_user; +} + +static void __poll_writer(struct work_struct *work) +{ + struct delayed_work *dwork = to_delayed_work(work); + struct ht_iterator *iter; + + iter = container_of(dwork, struct ht_iterator, poll_work); + + ring_buffer_poll_writer(iter->trace_buffer, iter->cpu); + + schedule_delayed_work((struct delayed_work *)work, + msecs_to_jiffies(RB_POLL_MS)); +} + +static int hyp_trace_pipe_open(struct inode *inode, struct file *file) +{ + struct hyp_trace_buffer *hyp_buffer = &hyp_trace_buffer; + int cpu = (s64)inode->i_private; + struct ht_iterator *iter = NULL; + int ret; + + mutex_lock(&hyp_buffer->lock); + + if (hyp_buffer->nr_readers == INT_MAX) { + ret = -EBUSY; + goto unlock; + } + + ret = hyp_trace_buffer_load(hyp_buffer, hyp_trace_buffer_size); + if (ret) + goto unlock; + + iter = kzalloc(sizeof(*iter), GFP_KERNEL); + if (!iter) { + ret = -ENOMEM; + goto unlock; + } + iter->trace_buffer = hyp_buffer->trace_buffer; + iter->cpu = cpu; + trace_seq_init(&iter->seq); + file->private_data = iter; + + ret = ring_buffer_poll_writer(hyp_buffer->trace_buffer, cpu); + if (ret) + goto unlock; + + INIT_DELAYED_WORK(&iter->poll_work, __poll_writer); +
schedule_delayed_work(&iter->poll_work, msecs_to_jiffies(RB_POLL_MS)); + + hyp_buffer->nr_readers++; + +unlock: + if (ret) { + hyp_trace_buffer_teardown(hyp_buffer); + kfree(iter); + } + + mutex_unlock(&hyp_buffer->lock); + + return ret; +} + +static int hyp_trace_pipe_release(struct inode *inode, struct file *file) +{ + struct hyp_trace_buffer *hyp_buffer = &hyp_trace_buffer; + struct ht_iterator *iter = file->private_data; + + cancel_delayed_work_sync(&iter->poll_work); + + mutex_lock(&hyp_buffer->lock); + + WARN_ON(--hyp_buffer->nr_readers < 0); + + hyp_trace_buffer_teardown(hyp_buffer); + + mutex_unlock(&hyp_buffer->lock); + + kfree(iter); + + return 0; +} + +static const struct file_operations hyp_trace_pipe_fops = { + .open = hyp_trace_pipe_open, + .read = hyp_trace_pipe_read, + .release = hyp_trace_pipe_release, + .llseek = no_llseek, +}; + +int hyp_trace_init_tracefs(void) +{ + struct dentry *root, *per_cpu_root; + char per_cpu_name[16]; + long cpu; + + if (!is_protected_kvm_enabled()) + return 0; + + root = tracefs_create_dir(TRACEFS_DIR, NULL); + if (!root) { + pr_err("Failed to create tracefs "TRACEFS_DIR"/\n"); + return -ENODEV; + } + + tracefs_create_file("tracing_on", TRACEFS_MODE_WRITE, root, NULL, + &hyp_tracing_on_fops); + + tracefs_create_file("buffer_size_kb", TRACEFS_MODE_WRITE, root, NULL, + &hyp_buffer_size_fops); + + tracefs_create_file("trace_pipe", TRACEFS_MODE_WRITE, root, + (void *)RING_BUFFER_ALL_CPUS, &hyp_trace_pipe_fops); + + per_cpu_root = tracefs_create_dir("per_cpu", root); + if (!per_cpu_root) { + pr_err("Failed to create tracefs folder "TRACEFS_DIR"/per_cpu/\n"); + return -ENODEV; + } + + for_each_possible_cpu(cpu) { + struct dentry *per_cpu_dir; + + snprintf(per_cpu_name, sizeof(per_cpu_name), "cpu%ld", cpu); + per_cpu_dir = tracefs_create_dir(per_cpu_name, per_cpu_root); + if (!per_cpu_dir) { + pr_warn("Failed to create tracefs "TRACEFS_DIR"/per_cpu/cpu%ld\n", + cpu); + continue; + } + +
tracefs_create_file("trace_pipe", TRACEFS_MODE_READ, per_cpu_dir, + (void *)cpu, &hyp_trace_pipe_fops); + } + + return 0; +} diff --git a/arch/arm64/kvm/hyp_trace.h b/arch/arm64/kvm/hyp_trace.h new file mode 100644 index 000000000000..14fc06c625a6 --- /dev/null +++ b/arch/arm64/kvm/hyp_trace.h @@ -0,0 +1,28 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef __ARM64_KVM_HYP_TRACE_H__ +#define __ARM64_KVM_HYP_TRACE_H__ + +#include +#include + +struct ht_iterator { + struct trace_buffer *trace_buffer; + int cpu; + struct hyp_entry_hdr *ent; + unsigned long lost_events; + int ent_cpu; + size_t ent_size; + u64 ts; + void *spare; + size_t copy_leftover; + struct trace_seq seq; + struct delayed_work poll_work; +}; + +#ifdef CONFIG_TRACING +int hyp_trace_init_tracefs(void); +#else +static inline int hyp_trace_init_tracefs(void) { return 0; } +#endif +#endif -- 2.46.0.598.g6f2099f65c-goog
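As an aside on the trace_pipe output format above: ht_print_trace_time() turns a raw nanosecond timestamp into the familiar "seconds.microseconds" column via two do_div() steps. A minimal userspace model of that split (illustrative names only, not part of the patch):

```c
#include <stdint.h>

/* Split a nanosecond timestamp into the seconds and microseconds
 * printed by ht_print_trace_time(): first drop sub-microsecond bits
 * (ns -> us), then divide by USEC_PER_SEC keeping the remainder. */
void ns_to_sec_usec(uint64_t ts_ns, uint64_t *sec, uint64_t *usec)
{
	uint64_t us = ts_ns / 1000;	/* first do_div(ts_ns, 1000) */

	*sec = us / 1000000;		/* quotient of do_div(.., USEC_PER_SEC) */
	*usec = us % 1000000;		/* remainder, printed as %06lu */
}
```

For example, 1234567890 ns prints as "    1.234567: ".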
Date: Wed, 11 Sep 2024 10:30:25 +0100 In-Reply-To: <20240911093029.3279154-1-vdonnefort@google.com> Message-ID: <20240911093029.3279154-10-vdonnefort@google.com> Subject: [PATCH 09/13] KVM: arm64: Add clock for hyp tracefs From: Vincent Donnefort To: rostedt@goodmis.org, mhiramat@kernel.org, linux-trace-kernel@vger.kernel.org, maz@kernel.org, oliver.upton@linux.dev Cc: kvmarm@lists.linux.dev, will@kernel.org, qperret@google.com, kernel-team@android.com, linux-kernel@vger.kernel.org, Vincent Donnefort, John Stultz, Thomas Gleixner, Stephen Boyd, "Christopher S. Hall", Richard Cochran, Lakshmi Sowjanya D

Configure the hypervisor tracing clock before starting tracing.
For tracing purposes, the boot clock is interesting as it doesn't stop on suspend. However, it is corrected on a regular basis, which means we must re-evaluate it every once in a while. Cc: John Stultz Cc: Thomas Gleixner Cc: Stephen Boyd Cc: Christopher S. Hall Cc: Richard Cochran Cc: Lakshmi Sowjanya D Signed-off-by: Vincent Donnefort diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index d549d7d491c3..96490f8c3ff2 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -79,6 +79,7 @@ enum __kvm_host_smccc_func { __KVM_HOST_SMCCC_FUNC___pkvm_init_vm, __KVM_HOST_SMCCC_FUNC___pkvm_init_vcpu, __KVM_HOST_SMCCC_FUNC___pkvm_teardown_vm, + __KVM_HOST_SMCCC_FUNC___pkvm_update_clock_tracing, __KVM_HOST_SMCCC_FUNC___pkvm_load_tracing, __KVM_HOST_SMCCC_FUNC___pkvm_teardown_tracing, __KVM_HOST_SMCCC_FUNC___pkvm_enable_tracing, diff --git a/arch/arm64/kvm/hyp/include/nvhe/trace.h b/arch/arm64/kvm/hyp/include/nvhe/trace.h index a7c0c73af0e5..df17683a3b12 100644 --- a/arch/arm64/kvm/hyp/include/nvhe/trace.h +++ b/arch/arm64/kvm/hyp/include/nvhe/trace.h @@ -16,6 +16,7 @@ struct hyp_buffer_page { void *tracing_reserve_entry(unsigned long length); void tracing_commit_entry(void); +void __pkvm_update_clock_tracing(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cyc); int __pkvm_load_tracing(unsigned long desc_va, size_t desc_size); void __pkvm_teardown_tracing(void); int __pkvm_enable_tracing(bool enable); @@ -24,6 +25,8 @@ int __pkvm_swap_reader_tracing(unsigned int cpu); static inline void *tracing_reserve_entry(unsigned long length) { return NULL; } static inline void tracing_commit_entry(void) { } + +static inline +void __pkvm_update_clock_tracing(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cyc) { } static inline int __pkvm_load_tracing(unsigned long desc_va, size_t desc_size) { return -ENODEV; } static inline void __pkvm_teardown_tracing(void) { } static inline int __pkvm_enable_tracing(bool
enable) { return -ENODEV; } diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c index 1fb3391e122a..7f5c3e888960 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c @@ -374,6 +374,18 @@ static void handle___pkvm_teardown_vm(struct kvm_cpu_context *host_ctxt) cpu_reg(host_ctxt, 1) = __pkvm_teardown_vm(handle); } +static void handle___pkvm_update_clock_tracing(struct kvm_cpu_context *host_ctxt) +{ + DECLARE_REG(u32, mult, host_ctxt, 1); + DECLARE_REG(u32, shift, host_ctxt, 2); + DECLARE_REG(u64, epoch_ns, host_ctxt, 3); + DECLARE_REG(u64, epoch_cyc, host_ctxt, 4); + + __pkvm_update_clock_tracing(mult, shift, epoch_ns, epoch_cyc); + + cpu_reg(host_ctxt, 1) = 0; +} + static void handle___pkvm_load_tracing(struct kvm_cpu_context *host_ctxt) { DECLARE_REG(unsigned long, desc_hva, host_ctxt, 1); @@ -435,6 +447,7 @@ static const hcall_t host_hcall[] = { HANDLE_FUNC(__pkvm_init_vm), HANDLE_FUNC(__pkvm_init_vcpu), HANDLE_FUNC(__pkvm_teardown_vm), + HANDLE_FUNC(__pkvm_update_clock_tracing), HANDLE_FUNC(__pkvm_load_tracing), HANDLE_FUNC(__pkvm_teardown_tracing), HANDLE_FUNC(__pkvm_enable_tracing), diff --git a/arch/arm64/kvm/hyp/nvhe/trace.c b/arch/arm64/kvm/hyp/nvhe/trace.c index debb3ee7dd3a..022fe2e24f82 100644 --- a/arch/arm64/kvm/hyp/nvhe/trace.c +++ b/arch/arm64/kvm/hyp/nvhe/trace.c @@ -468,6 +468,21 @@ static void rb_teardown_bpage_backing(void) hyp_buffer_pages_backing.size = 0; } +void __pkvm_update_clock_tracing(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cyc) +{ + int cpu; + + /* After this loop, all CPUs are observing the new bank... */ + for (cpu = 0; cpu < hyp_nr_cpus; cpu++) { + struct hyp_rb_per_cpu *cpu_buffer = per_cpu_ptr(&trace_rb, cpu); + + while (atomic_read(&cpu_buffer->status) == HYP_RB_WRITING); + } + + /* ...we can now override the old one and swap.
*/ + trace_clock_update(mult, shift, epoch_ns, epoch_cyc); +} + int __pkvm_swap_reader_tracing(unsigned int cpu) { struct hyp_rb_per_cpu *cpu_buffer; diff --git a/arch/arm64/kvm/hyp_trace.c b/arch/arm64/kvm/hyp_trace.c index b9d1f96d0678..1720daeda8ae 100644 --- a/arch/arm64/kvm/hyp_trace.c +++ b/arch/arm64/kvm/hyp_trace.c @@ -16,10 +16,25 @@ #define RB_POLL_MS 100 +/* Same 10min used by clocksource when width is more than 32-bits */ +#define CLOCK_MAX_CONVERSION_S 600 +#define CLOCK_INIT_MS 100 +#define CLOCK_POLL_MS 500 + #define TRACEFS_DIR "hypervisor" #define TRACEFS_MODE_WRITE 0640 #define TRACEFS_MODE_READ 0440 +struct hyp_trace_clock { + u64 cycles; + u64 max_delta; + u64 boot; + u32 mult; + u32 shift; + struct delayed_work work; + struct completion ready; +}; + static struct hyp_trace_buffer { struct hyp_trace_desc *desc; struct ring_buffer_writer writer; @@ -28,6 +43,7 @@ static struct hyp_trace_buffer { bool tracing_on; int nr_readers; struct mutex lock; + struct hyp_trace_clock clock; } hyp_trace_buffer = { .lock = __MUTEX_INITIALIZER(hyp_trace_buffer.lock), }; @@ -74,6 +90,107 @@ bpage_backing_free(struct hyp_buffer_pages_backing *bpage_backing) free_pages_exact((void *)bpage_backing->start, bpage_backing->size); } +static void __hyp_clock_work(struct work_struct *work) +{ + struct delayed_work *dwork = to_delayed_work(work); + struct hyp_trace_buffer *hyp_buffer; + struct hyp_trace_clock *hyp_clock; + struct system_time_snapshot snap; + u64 rate, delta_cycles; + u64 boot, delta_boot; + u64 err = 0; + + hyp_clock = container_of(dwork, struct hyp_trace_clock, work); + hyp_buffer = container_of(hyp_clock, struct hyp_trace_buffer, clock); + + ktime_get_snapshot(&snap); + boot = ktime_to_ns(snap.boot); + + delta_boot = boot - hyp_clock->boot; + delta_cycles = snap.cycles - hyp_clock->cycles; + + /* Compare hyp clock with the kernel boot clock */ + if (hyp_clock->mult) { + u64 cur = delta_cycles; + + cur *=
hyp_clock->mult; + cur >>= hyp_clock->shift; + cur += hyp_clock->boot; + + err = abs_diff(cur, boot); + + /* No deviation, only update epoch if necessary */ + if (!err) { + if (delta_cycles >= hyp_clock->max_delta) + goto update_hyp; + + goto resched; + } + + /* Warn if the error is above tracing precision (1us) */ + if (hyp_buffer->tracing_on && err > NSEC_PER_USEC) + pr_warn_ratelimited("hyp trace clock off by %lluus\n", + err / NSEC_PER_USEC); + } + + if (delta_boot > U32_MAX) { + do_div(delta_boot, NSEC_PER_SEC); + rate = delta_cycles; + } else { + rate = delta_cycles * NSEC_PER_SEC; + } + + do_div(rate, delta_boot); + + clocks_calc_mult_shift(&hyp_clock->mult, &hyp_clock->shift, + rate, NSEC_PER_SEC, CLOCK_MAX_CONVERSION_S); + +update_hyp: + hyp_clock->max_delta = (U64_MAX / hyp_clock->mult) >> 1; + hyp_clock->cycles = snap.cycles; + hyp_clock->boot = boot; + kvm_call_hyp_nvhe(__pkvm_update_clock_tracing, hyp_clock->mult, + hyp_clock->shift, hyp_clock->boot, hyp_clock->cycles); + complete(&hyp_clock->ready); + + pr_debug("hyp trace clock update mult=%u shift=%u max_delta=%llu err=%llu\n", + hyp_clock->mult, hyp_clock->shift, hyp_clock->max_delta, err); + +resched: + schedule_delayed_work(&hyp_clock->work, + msecs_to_jiffies(CLOCK_POLL_MS)); +} + +static void hyp_clock_start(struct hyp_trace_buffer *hyp_buffer) +{ + struct hyp_trace_clock *hyp_clock = &hyp_buffer->clock; + struct system_time_snapshot snap; + + ktime_get_snapshot(&snap); + + hyp_clock->boot = ktime_to_ns(snap.boot); + hyp_clock->cycles = snap.cycles; + hyp_clock->mult = 0; + + init_completion(&hyp_clock->ready); + INIT_DELAYED_WORK(&hyp_clock->work, __hyp_clock_work); + schedule_delayed_work(&hyp_clock->work, msecs_to_jiffies(CLOCK_INIT_MS)); +} + +static void hyp_clock_stop(struct hyp_trace_buffer *hyp_buffer) +{ + struct hyp_trace_clock *hyp_clock = &hyp_buffer->clock; + + cancel_delayed_work_sync(&hyp_clock->work); +} + +static void hyp_clock_wait(struct
hyp_trace_buffer *hyp_buffer) +{ + struct hyp_trace_clock *hyp_clock = &hyp_buffer->clock; + + wait_for_completion(&hyp_clock->ready); +} + static int __get_reader_page(int cpu) { return kvm_call_hyp_nvhe(__pkvm_swap_reader_tracing, cpu); @@ -294,10 +411,14 @@ static int hyp_trace_start(void) if (hyp_buffer->tracing_on) goto out; + hyp_clock_start(hyp_buffer); + ret = hyp_trace_buffer_load(hyp_buffer, hyp_trace_buffer_size); if (ret) goto out; + hyp_clock_wait(hyp_buffer); + ret = kvm_call_hyp_nvhe(__pkvm_enable_tracing, true); if (ret) { hyp_trace_buffer_teardown(hyp_buffer); @@ -307,6 +428,9 @@ static int hyp_trace_start(void) hyp_buffer->tracing_on = true; out: + if (!hyp_buffer->tracing_on) + hyp_clock_stop(hyp_buffer); + mutex_unlock(&hyp_buffer->lock); return ret; @@ -324,6 +448,7 @@ static void hyp_trace_stop(void) ret = kvm_call_hyp_nvhe(__pkvm_enable_tracing, false); if (!ret) { + hyp_clock_stop(hyp_buffer); ring_buffer_poll_writer(hyp_buffer->trace_buffer, RING_BUFFER_ALL_CPUS); hyp_buffer->tracing_on = false; @@ -615,6 +740,14 @@ static const struct file_operations hyp_trace_pipe_fops = { .llseek = no_llseek, }; +static int hyp_trace_clock_show(struct seq_file *m, void *v) +{ + seq_puts(m, "[boot]\n"); + + return 0; +} +DEFINE_SHOW_ATTRIBUTE(hyp_trace_clock); + int hyp_trace_init_tracefs(void) { struct dentry *root, *per_cpu_root; @@ -639,6 +772,9 @@ int hyp_trace_init_tracefs(void) tracefs_create_file("trace_pipe", TRACEFS_MODE_WRITE, root, (void *)RING_BUFFER_ALL_CPUS, &hyp_trace_pipe_fops); + tracefs_create_file("trace_clock", TRACEFS_MODE_READ, root, NULL, + &hyp_trace_clock_fops); + per_cpu_root = tracefs_create_dir("per_cpu", root); if (!per_cpu_root) { pr_err("Failed to create tracefs folder "TRACEFS_DIR"/per_cpu/\n"); -- 2.46.0.598.g6f2099f65c-goog
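The clock update in this patch converts a cycle delta to nanoseconds the same way the kernel clocksource code does, i.e. ns = (delta_cycles * mult) >> shift, with max_delta bounding delta_cycles so the 64-bit multiply cannot overflow. A minimal sketch of that conversion (the mult/shift values below are illustrative, not ones computed by clocks_calc_mult_shift()):

```c
#include <stdint.h>

/* Convert a cycle delta to nanoseconds with a mult/shift pair, as the
 * hyp clock does. Callers must keep delta_cycles below the max_delta
 * bound ((U64_MAX / mult) >> 1 in the patch) to avoid overflowing the
 * 64-bit multiply. */
uint64_t hyp_cycles_to_ns(uint64_t delta_cycles, uint32_t mult, uint32_t shift)
{
	return (delta_cycles * mult) >> shift;
}
```

With mult = 1 << shift the conversion is the identity (a 1 GHz counter: one cycle per nanosecond); doubling mult doubles the reported time.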
Date: Wed, 11 Sep 2024 10:30:26 +0100 In-Reply-To: <20240911093029.3279154-1-vdonnefort@google.com> Mime-Version: 1.0 References:
<20240911093029.3279154-1-vdonnefort@google.com> Message-ID: <20240911093029.3279154-11-vdonnefort@google.com> Subject: [PATCH 10/13] KVM: arm64: Add raw interface for hyp tracefs From: Vincent Donnefort To: rostedt@goodmis.org, mhiramat@kernel.org, linux-trace-kernel@vger.kernel.org, maz@kernel.org, oliver.upton@linux.dev Cc: kvmarm@lists.linux.dev, will@kernel.org, qperret@google.com, kernel-team@android.com, linux-kernel@vger.kernel.org, Vincent Donnefort

The raw interface enables userspace tools such as trace-cmd to directly read the ring-buffer without any decoding by the kernel. Signed-off-by: Vincent Donnefort diff --git a/arch/arm64/kvm/hyp_trace.c b/arch/arm64/kvm/hyp_trace.c index 1720daeda8ae..0d0e5eada816 100644 --- a/arch/arm64/kvm/hyp_trace.c +++ b/arch/arm64/kvm/hyp_trace.c @@ -740,6 +740,86 @@ static const struct file_operations hyp_trace_pipe_fops = { .llseek = no_llseek, }; +static ssize_t +hyp_trace_raw_read(struct file *file, char __user *ubuf, + size_t cnt, loff_t *ppos) +{ + struct ht_iterator *iter = (struct ht_iterator *)file->private_data; + size_t size; + int ret; + + if (iter->copy_leftover) + goto read; + +again: + ret = ring_buffer_read_page(iter->trace_buffer, + (struct buffer_data_read_page *)iter->spare, + cnt, iter->cpu, 0); + if (ret < 0) { + if (!ring_buffer_empty_cpu(iter->trace_buffer, iter->cpu)) + return 0; + + ret = ring_buffer_wait(iter->trace_buffer, iter->cpu, 0, NULL, + NULL); + if (ret < 0) + return ret; + + goto again; + } + + iter->copy_leftover = 0; + +read: + size = PAGE_SIZE - iter->copy_leftover; + if (size > cnt) + size = cnt; + + ret = copy_to_user(ubuf, iter->spare + PAGE_SIZE - size, size); + if (ret == size) + return -EFAULT; + + size -= ret; + *ppos += size; + iter->copy_leftover = ret; + + return size; +} + +static int
hyp_trace_raw_open(struct inode *inode, struct file *file) +{ + int ret = hyp_trace_pipe_open(inode, file); + struct ht_iterator *iter; + + if (ret) + return ret; + + iter = file->private_data; + iter->spare = ring_buffer_alloc_read_page(iter->trace_buffer, iter->cpu); + if (IS_ERR(iter->spare)) { + ret = PTR_ERR(iter->spare); + iter->spare = NULL; + return ret; + } + + return 0; +} + +static int hyp_trace_raw_release(struct inode *inode, struct file *file) +{ + struct ht_iterator *iter = file->private_data; + + ring_buffer_free_read_page(iter->trace_buffer, iter->cpu, iter->spare); + + return hyp_trace_pipe_release(inode, file); +} + +static const struct file_operations hyp_trace_raw_fops = { + .open = hyp_trace_raw_open, + .read = hyp_trace_raw_read, + .release = hyp_trace_raw_release, + .llseek = no_llseek, +}; + static int hyp_trace_clock_show(struct seq_file *m, void *v) { seq_puts(m, "[boot]\n"); @@ -794,6 +874,9 @@ int hyp_trace_init_tracefs(void) tracefs_create_file("trace_pipe", TRACEFS_MODE_READ, per_cpu_dir, (void *)cpu, &hyp_trace_pipe_fops); + + tracefs_create_file("trace_pipe_raw", TRACEFS_MODE_READ, per_cpu_dir, + (void *)cpu, &hyp_trace_raw_fops); } return 0; -- 2.46.0.598.g6f2099f65c-goog
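Both pipe interfaces rely on the cross-CPU ordering done by __ht_next_pipe_event() in patch 08: among the next pending event of each per-CPU buffer, pick the one with the smallest timestamp. A minimal model of that merge step, with a plain timestamp array standing in for ring_buffer_peek() results (names are illustrative, not part of the series):

```c
#include <stdint.h>

#define TS_EMPTY UINT64_MAX	/* stands in for an empty per-CPU buffer */

/* Return the CPU whose next pending event is the oldest, or -1 when
 * every CPU is empty; mirrors the for_each_possible_cpu() scan that
 * keeps the smallest timestamp seen so far. */
int oldest_cpu(const uint64_t *ts, int nr_cpu)
{
	uint64_t best = TS_EMPTY;
	int cpu, winner = -1;

	for (cpu = 0; cpu < nr_cpu; cpu++) {
		if (ts[cpu] < best) {
			best = ts[cpu];
			winner = cpu;
		}
	}

	return winner;
}
```

Repeatedly consuming the event from the winning CPU yields a globally time-ordered stream, which is what the all-CPU trace_pipe file produces.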
Date: Wed, 11 Sep 2024 10:30:27 +0100 In-Reply-To: <20240911093029.3279154-1-vdonnefort@google.com> Message-ID: <20240911093029.3279154-12-vdonnefort@google.com> Subject: [PATCH 11/13] KVM: arm64: Add trace interface for hyp tracefs From: Vincent Donnefort To: rostedt@goodmis.org, mhiramat@kernel.org, linux-trace-kernel@vger.kernel.org, maz@kernel.org, oliver.upton@linux.dev Cc: kvmarm@lists.linux.dev, will@kernel.org, qperret@google.com, kernel-team@android.com, linux-kernel@vger.kernel.org, Vincent Donnefort

The trace
interface is solely here to reset tracing. Non-consuming read is not yet supported due to the lack of support in the ring-buffer meta page. Signed-off-by: Vincent Donnefort diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index 96490f8c3ff2..17896e6ceca7 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -83,6 +83,7 @@ enum __kvm_host_smccc_func { __KVM_HOST_SMCCC_FUNC___pkvm_load_tracing, __KVM_HOST_SMCCC_FUNC___pkvm_teardown_tracing, __KVM_HOST_SMCCC_FUNC___pkvm_enable_tracing, + __KVM_HOST_SMCCC_FUNC___pkvm_reset_tracing, __KVM_HOST_SMCCC_FUNC___pkvm_swap_reader_tracing, }; diff --git a/arch/arm64/kvm/hyp/include/nvhe/trace.h b/arch/arm64/kvm/hyp/include/nvhe/trace.h index df17683a3b12..1004e1edf24f 100644 --- a/arch/arm64/kvm/hyp/include/nvhe/trace.h +++ b/arch/arm64/kvm/hyp/include/nvhe/trace.h @@ -20,6 +20,7 @@ void __pkvm_update_clock_tracing(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cy int __pkvm_load_tracing(unsigned long desc_va, size_t desc_size); void __pkvm_teardown_tracing(void); int __pkvm_enable_tracing(bool enable); +int __pkvm_reset_tracing(unsigned int cpu); int __pkvm_swap_reader_tracing(unsigned int cpu); #else static inline void *tracing_reserve_entry(unsigned long length) { return NULL; } @@ -30,6 +31,7 @@ void __pkvm_update_clock_tracing(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cy static inline int __pkvm_load_tracing(unsigned long desc_va, size_t desc_size) { return -ENODEV; } static inline void __pkvm_teardown_tracing(void) { } static inline int __pkvm_enable_tracing(bool enable) { return -ENODEV; } +static inline int __pkvm_reset_tracing(unsigned int cpu) { return -ENODEV; } static inline int __pkvm_swap_reader_tracing(unsigned int cpu) { return -ENODEV; } #endif #endif diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c index 7f5c3e888960..dc7a85922117 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c +++
b/arch/arm64/kvm/hyp/nvhe/hyp-main.c @@ -408,6 +408,13 @@ static void handle___pkvm_enable_tracing(struct kvm_cp= u_context *host_ctxt) cpu_reg(host_ctxt, 1) =3D __pkvm_enable_tracing(enable); } =20 +static void handle___pkvm_reset_tracing(struct kvm_cpu_context *host_ctxt) +{ + DECLARE_REG(unsigned int, cpu, host_ctxt, 1); + + cpu_reg(host_ctxt, 1) =3D __pkvm_reset_tracing(cpu); +} + static void handle___pkvm_swap_reader_tracing(struct kvm_cpu_context *host= _ctxt) { DECLARE_REG(unsigned int, cpu, host_ctxt, 1); @@ -451,6 +458,7 @@ static const hcall_t host_hcall[] =3D { HANDLE_FUNC(__pkvm_load_tracing), HANDLE_FUNC(__pkvm_teardown_tracing), HANDLE_FUNC(__pkvm_enable_tracing), + HANDLE_FUNC(__pkvm_reset_tracing), HANDLE_FUNC(__pkvm_swap_reader_tracing), }; =20 diff --git a/arch/arm64/kvm/hyp/nvhe/trace.c b/arch/arm64/kvm/hyp/nvhe/trac= e.c index 022fe2e24f82..6ea0f1d475bb 100644 --- a/arch/arm64/kvm/hyp/nvhe/trace.c +++ b/arch/arm64/kvm/hyp/nvhe/trace.c @@ -284,12 +284,20 @@ static int rb_page_init(struct hyp_buffer_page *bpage= , unsigned long hva) return 0; } =20 +static void rb_page_reset(struct hyp_buffer_page *bpage) +{ + bpage->write =3D 0; + bpage->entries =3D 0; + + local_set(&bpage->page->commit, 0); +} + static bool rb_cpu_loaded(struct hyp_rb_per_cpu *cpu_buffer) { return !!cpu_buffer->bpages; } =20 -static void rb_cpu_disable_writing(struct hyp_rb_per_cpu *cpu_buffer) +static int rb_cpu_disable_writing(struct hyp_rb_per_cpu *cpu_buffer) { int prev_status; =20 @@ -299,6 +307,8 @@ static void rb_cpu_disable_writing(struct hyp_rb_per_cp= u *cpu_buffer) HYP_RB_READY, HYP_RB_UNAVAILABLE); } while (prev_status =3D=3D HYP_RB_WRITING); + + return prev_status; } =20 static int rb_cpu_enable_writing(struct hyp_rb_per_cpu *cpu_buffer) @@ -311,6 +321,38 @@ static int rb_cpu_enable_writing(struct hyp_rb_per_cpu= *cpu_buffer) return 0; } =20 +static int rb_cpu_reset(struct hyp_rb_per_cpu *cpu_buffer) +{ + struct hyp_buffer_page *bpage; + int prev_status; + + if 
(!rb_cpu_loaded(cpu_buffer)) + return -ENODEV; + + prev_status =3D rb_cpu_disable_writing(cpu_buffer); + + bpage =3D cpu_buffer->head_page; + do { + rb_page_reset(bpage); + bpage =3D rb_next_page(bpage); + } while (bpage !=3D cpu_buffer->head_page); + + rb_page_reset(cpu_buffer->reader_page); + + cpu_buffer->meta->reader.read =3D 0; + cpu_buffer->meta->reader.lost_events =3D 0; + cpu_buffer->meta->entries =3D 0; + cpu_buffer->meta->overrun =3D 0; + cpu_buffer->meta->read =3D 0; + meta_pages_lost(cpu_buffer->meta) =3D 0; + meta_pages_touched(cpu_buffer->meta) =3D 0; + + if (prev_status =3D=3D HYP_RB_READY) + rb_cpu_enable_writing(cpu_buffer); + + return 0; +} + static void rb_cpu_teardown(struct hyp_rb_per_cpu *cpu_buffer) { int i; @@ -602,3 +644,17 @@ int __pkvm_enable_tracing(bool enable) =20 return ret; } + +int __pkvm_reset_tracing(unsigned int cpu) +{ + int ret =3D 0; + + if (cpu >=3D hyp_nr_cpus) + return -EINVAL; + + hyp_spin_lock(&trace_rb_lock); + ret =3D rb_cpu_reset(per_cpu_ptr(&trace_rb, cpu)); + hyp_spin_unlock(&trace_rb_lock); + + return ret; +} diff --git a/arch/arm64/kvm/hyp_trace.c b/arch/arm64/kvm/hyp_trace.c index 0d0e5eada816..8ac8f9763cbd 100644 --- a/arch/arm64/kvm/hyp_trace.c +++ b/arch/arm64/kvm/hyp_trace.c @@ -196,6 +196,11 @@ static int __get_reader_page(int cpu) return kvm_call_hyp_nvhe(__pkvm_swap_reader_tracing, cpu); } =20 +static int __reset(int cpu) +{ + return kvm_call_hyp_nvhe(__pkvm_reset_tracing, cpu); +} + static void hyp_trace_free_pages(struct hyp_trace_desc *desc) { struct rb_page_desc *rb_desc; @@ -354,6 +359,7 @@ static int hyp_trace_buffer_load(struct hyp_trace_buffe= r *hyp_buffer, size_t siz =20 hyp_buffer->writer.pdesc =3D &desc->page_desc; hyp_buffer->writer.get_reader_page =3D __get_reader_page; + hyp_buffer->writer.reset =3D __reset; hyp_buffer->trace_buffer =3D ring_buffer_reader(&hyp_buffer->writer); if (!hyp_buffer->trace_buffer) { ret =3D -ENOMEM; @@ -820,6 +826,49 @@ static const struct file_operations 
hyp_trace_raw_fops= =3D { .llseek =3D no_llseek, }; =20 +static void hyp_trace_reset(int cpu) +{ + struct hyp_trace_buffer *hyp_buffer =3D &hyp_trace_buffer; + + mutex_lock(&hyp_buffer->lock); + + if (!hyp_trace_buffer_loaded(hyp_buffer)) + goto out; + + if (cpu =3D=3D RING_BUFFER_ALL_CPUS) + ring_buffer_reset(hyp_buffer->trace_buffer); + else + ring_buffer_reset_cpu(hyp_buffer->trace_buffer, cpu); + +out: + mutex_unlock(&hyp_buffer->lock); +} + +static int hyp_trace_open(struct inode *inode, struct file *file) +{ + int cpu =3D (s64)inode->i_private; + + if (file->f_mode & FMODE_WRITE) { + hyp_trace_reset(cpu); + + return 0; + } + + return -EPERM; +} + +static ssize_t hyp_trace_write(struct file *filp, const char __user *ubuf, + size_t count, loff_t *ppos) +{ + return count; +} + +static const struct file_operations hyp_trace_fops =3D { + .open =3D hyp_trace_open, + .write =3D hyp_trace_write, + .release =3D NULL, +}; + static int hyp_trace_clock_show(struct seq_file *m, void *v) { seq_puts(m, "[boot]\n"); @@ -852,6 +901,9 @@ int hyp_trace_init_tracefs(void) tracefs_create_file("trace_pipe", TRACEFS_MODE_WRITE, root, (void *)RING_BUFFER_ALL_CPUS, &hyp_trace_pipe_fops); =20 + tracefs_create_file("trace", TRACEFS_MODE_WRITE, root, + (void *)RING_BUFFER_ALL_CPUS, &hyp_trace_fops); + tracefs_create_file("trace_clock", TRACEFS_MODE_READ, root, NULL, &hyp_trace_clock_fops); =20 @@ -877,6 +929,9 @@ int hyp_trace_init_tracefs(void) =20 tracefs_create_file("trace_pipe_raw", TRACEFS_MODE_READ, per_cpu_dir, (void *)cpu, &hyp_trace_pipe_fops); + + tracefs_create_file("trace", TRACEFS_MODE_READ, per_cpu_dir, + (void *)cpu, &hyp_trace_fops); } =20 return 0; --=20 2.46.0.598.g6f2099f65c-goog From nobody Sat Nov 30 05:43:21 2024 Received: from mail-wm1-f73.google.com (mail-wm1-f73.google.com [209.85.128.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2E7D4187349 for ; 
Date: Wed, 11 Sep 2024 10:30:28 +0100
In-Reply-To: <20240911093029.3279154-1-vdonnefort@google.com>
References: <20240911093029.3279154-1-vdonnefort@google.com>
Message-ID:
<20240911093029.3279154-13-vdonnefort@google.com>
Subject: [PATCH 12/13] KVM: arm64: Add support for hyp events
From: Vincent Donnefort
To: rostedt@goodmis.org, mhiramat@kernel.org, linux-trace-kernel@vger.kernel.org,
    maz@kernel.org, oliver.upton@linux.dev
Cc: kvmarm@lists.linux.dev, will@kernel.org, qperret@google.com,
    kernel-team@android.com, linux-kernel@vger.kernel.org, Vincent Donnefort

Following the introduction of hyp tracing for pKVM, add the ability to
describe and emit events into the hypervisor ring-buffers.

Hypervisor events are declared in kvm_hypevents.h and can be called with
trace_<event_name>() in a similar fashion to the kernel tracefs events.
hyp_enter and hyp_exit events are provided as an example.

Signed-off-by: Vincent Donnefort

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 17896e6ceca7..3710deb6eaa0 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -85,6 +85,7 @@ enum __kvm_host_smccc_func {
 	__KVM_HOST_SMCCC_FUNC___pkvm_enable_tracing,
 	__KVM_HOST_SMCCC_FUNC___pkvm_reset_tracing,
 	__KVM_HOST_SMCCC_FUNC___pkvm_swap_reader_tracing,
+	__KVM_HOST_SMCCC_FUNC___pkvm_enable_event,
 };
 
 #define DECLARE_KVM_VHE_SYM(sym) extern char sym[]
diff --git a/arch/arm64/include/asm/kvm_define_hypevents.h b/arch/arm64/include/asm/kvm_define_hypevents.h
new file mode 100644
index 000000000000..efa2c2cb3ef2
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_define_hypevents.h
@@ -0,0 +1,60 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include
+
+#include
+#include
+
+#ifndef HYP_EVENT_FILE
+# undef __ARM64_KVM_HYPEVENTS_H_
+# define __HYP_EVENT_FILE
+#else
+# define __HYP_EVENT_FILE __stringify(HYP_EVENT_FILE)
+#endif
+
+#define HYP_EVENT(__name, __proto, __struct, __assign, __printk)	\
+	HYP_EVENT_FORMAT(__name, __struct);				\
+	static void hyp_event_trace_##__name(struct ht_iterator *iter)	\
+	{								\
+		struct trace_hyp_format_##__name __maybe_unused *__entry = \
+			(struct trace_hyp_format_##__name *)iter->ent;	\
+		trace_seq_puts(&iter->seq, #__name);			\
+		trace_seq_putc(&iter->seq, ' ');			\
+		trace_seq_printf(&iter->seq, __printk);			\
+		trace_seq_putc(&iter->seq, '\n');			\
+	}
+#define HYP_EVENT_MULTI_READ
+#include __HYP_EVENT_FILE
+
+#undef he_field
+#define he_field(_type, _item)					\
+	{							\
+		.type = #_type, .name = #_item,			\
+		.size = sizeof(_type), .align = __alignof__(_type), \
+		.is_signed = is_signed_type(_type),		\
+	},
+#undef HYP_EVENT
+#define HYP_EVENT(__name, __proto, __struct, __assign, __printk)	\
+	static struct trace_event_fields hyp_event_fields_##__name[] = { \
+		__struct						\
+		{}							\
+	};
+#include __HYP_EVENT_FILE
+
+#undef HYP_EVENT
+#undef HE_PRINTK
+#define __entry REC
+#define HE_PRINTK(fmt, args...) "\"" fmt "\", " __stringify(args)
+#define HYP_EVENT(__name, __proto, __struct, __assign, __printk)	\
+	static char hyp_event_print_fmt_##__name[] = __printk;		\
+	static bool hyp_event_enabled_##__name;				\
+	struct hyp_event __section("_hyp_events") hyp_event_##__name = {\
+		.name = #__name,					\
+		.enabled = &hyp_event_enabled_##__name,			\
+		.fields = hyp_event_fields_##__name,			\
+		.print_fmt = hyp_event_print_fmt_##__name,		\
+		.trace_func = hyp_event_trace_##__name,			\
+	}
+#include __HYP_EVENT_FILE
+
+#undef HYP_EVENT_MULTI_READ
diff --git a/arch/arm64/include/asm/kvm_hypevents.h b/arch/arm64/include/asm/kvm_hypevents.h
new file mode 100644
index 000000000000..0b98a87a1250
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_hypevents.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#if !defined(__ARM64_KVM_HYPEVENTS_H_) || defined(HYP_EVENT_MULTI_READ)
+#define __ARM64_KVM_HYPEVENTS_H_
+
+#ifdef __KVM_NVHE_HYPERVISOR__
+#include
+#endif
+
+/*
+ * Hypervisor events definitions.
+ */
+
+HYP_EVENT(hyp_enter,
+	HE_PROTO(void),
+	HE_STRUCT(
+	),
+	HE_ASSIGN(
+	),
+	HE_PRINTK(" ")
+);
+
+HYP_EVENT(hyp_exit,
+	HE_PROTO(void),
+	HE_STRUCT(
+	),
+	HE_ASSIGN(
+	),
+	HE_PRINTK(" ")
+);
+#endif
diff --git a/arch/arm64/include/asm/kvm_hypevents_defs.h b/arch/arm64/include/asm/kvm_hypevents_defs.h
new file mode 100644
index 000000000000..473bf4363d82
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_hypevents_defs.h
@@ -0,0 +1,41 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ARM64_KVM_HYPEVENTS_DEFS_H
+#define __ARM64_KVM_HYPEVENTS_DEFS_H
+
+struct hyp_event_id {
+	unsigned short id;
+	void *data;
+};
+
+#define HYP_EVENT_NAME_MAX 32
+
+struct hyp_event {
+	char name[HYP_EVENT_NAME_MAX];
+	bool *enabled;
+	char *print_fmt;
+	struct trace_event_fields *fields;
+	void (*trace_func)(struct ht_iterator *iter);
+	int id;
+};
+
+struct hyp_entry_hdr {
+	unsigned short id;
+};
+
+/*
+ * Hyp events definitions common to the hyp and the host
+ */
+#define HYP_EVENT_FORMAT(__name, __struct)		\
+	struct __packed trace_hyp_format_##__name {	\
+		struct hyp_entry_hdr hdr;		\
+		__struct				\
+	}
+
+#define HE_PROTO(args...) args
+#define HE_STRUCT(args...) args
+#define HE_ASSIGN(args...) args
+#define HE_PRINTK(args...) args
+
+#define he_field(type, item) type item;
+#endif
diff --git a/arch/arm64/include/asm/kvm_hyptrace.h b/arch/arm64/include/asm/kvm_hyptrace.h
index 7da6a248c7fa..7b66bd06537f 100644
--- a/arch/arm64/include/asm/kvm_hyptrace.h
+++ b/arch/arm64/include/asm/kvm_hyptrace.h
@@ -4,6 +4,22 @@
 #include
 
 #include
+#include
+#include
+
+struct ht_iterator {
+	struct trace_buffer *trace_buffer;
+	int cpu;
+	struct hyp_entry_hdr *ent;
+	unsigned long lost_events;
+	int ent_cpu;
+	size_t ent_size;
+	u64 ts;
+	void *spare;
+	size_t copy_leftover;
+	struct trace_seq seq;
+	struct delayed_work poll_work;
+};
 
 /*
  * Host donations to the hypervisor to store the struct hyp_buffer_page.
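[Editor's note] The HYP_EVENT() machinery above is a multi-include (X-macro)
pattern: the same event declaration header is included several times, each time
with a different HYP_EVENT definition, to generate in turn the print function,
the field table and the event descriptor. Below is a minimal, self-contained
userspace sketch of that technique; all names (EVENT_LIST, event_set, ...) are
illustrative and not part of the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative event list, expanded once per EVENT() definition below. */
#define EVENT_LIST \
	EVENT(hyp_enter) \
	EVENT(hyp_exit)

/* First expansion: one enable flag per event. */
#define EVENT(name) static int name##_enabled;
EVENT_LIST
#undef EVENT

/* Second expansion: a descriptor table built from the very same list. */
struct event_desc {
	const char *name;
	int *enabled;
};

#define EVENT(name) { #name, &name##_enabled },
static struct event_desc events[] = { EVENT_LIST };
#undef EVENT

/* Toggle an event by name, as a tracefs "enable" file write would. */
static int event_set(const char *name, int on)
{
	for (size_t i = 0; i < sizeof(events) / sizeof(events[0]); i++) {
		if (!strcmp(events[i].name, name)) {
			*events[i].enabled = on;
			return 0;
		}
	}
	return -1; /* unknown event, mirroring the -EINVAL in the patch */
}
```

The key property, which the patch relies on, is that adding one line to the
event list updates every generated artifact at once.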
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h index 8f5422ed1b75..e60754cdbf33 100644 --- a/arch/arm64/kernel/image-vars.h +++ b/arch/arm64/kernel/image-vars.h @@ -134,6 +134,10 @@ KVM_NVHE_ALIAS(__hyp_bss_start); KVM_NVHE_ALIAS(__hyp_bss_end); KVM_NVHE_ALIAS(__hyp_rodata_start); KVM_NVHE_ALIAS(__hyp_rodata_end); +#ifdef CONFIG_TRACING +KVM_NVHE_ALIAS(__hyp_event_ids_start); +KVM_NVHE_ALIAS(__hyp_event_ids_end); +#endif =20 /* pKVM static key */ KVM_NVHE_ALIAS(kvm_protected_mode_initialized); diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.ld= s.S index 55a8e310ea12..96986c1f852c 100644 --- a/arch/arm64/kernel/vmlinux.lds.S +++ b/arch/arm64/kernel/vmlinux.lds.S @@ -13,12 +13,23 @@ *(__kvm_ex_table) \ __stop___kvm_ex_table =3D .; =20 +#ifdef CONFIG_TRACING +#define HYPERVISOR_EVENT_IDS \ + . =3D ALIGN(PAGE_SIZE); \ + __hyp_event_ids_start =3D .; \ + *(HYP_SECTION_NAME(.event_ids)) \ + __hyp_event_ids_end =3D .; +#else +#define HYPERVISOR_EVENT_IDS +#endif + #define HYPERVISOR_DATA_SECTIONS \ HYP_SECTION_NAME(.rodata) : { \ . =3D ALIGN(PAGE_SIZE); \ __hyp_rodata_start =3D .; \ *(HYP_SECTION_NAME(.data..ro_after_init)) \ *(HYP_SECTION_NAME(.rodata)) \ + HYPERVISOR_EVENT_IDS \ . 
=3D ALIGN(PAGE_SIZE); \ __hyp_rodata_end =3D .; \ } @@ -200,6 +211,13 @@ SECTIONS ASSERT(SIZEOF(.got.plt) =3D=3D 0 || SIZEOF(.got.plt) =3D=3D 0x18, "Unexpected GOT/PLT entries detected!") =20 +#ifdef CONFIG_TRACING + .rodata.hyp_events : { + __hyp_events_start =3D .; + *(_hyp_events) + __hyp_events_end =3D .; + } +#endif /* code sections that are never executed via the kernel mapping */ .rodata.text : { TRAMP_TEXT diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile index c5bbf6b087a0..3b7dbd7f6824 100644 --- a/arch/arm64/kvm/Makefile +++ b/arch/arm64/kvm/Makefile @@ -28,7 +28,7 @@ kvm-y +=3D arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.= o \ kvm-$(CONFIG_HW_PERF_EVENTS) +=3D pmu-emul.o pmu.o kvm-$(CONFIG_ARM64_PTR_AUTH) +=3D pauth.o =20 -kvm-$(CONFIG_TRACING) +=3D hyp_trace.o +kvm-$(CONFIG_TRACING) +=3D hyp_events.o hyp_trace.o =20 always-y :=3D hyp_constants.h hyp-constants.s =20 diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 444719b44f7a..737aef39424b 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -2656,6 +2656,8 @@ static int __init init_hyp_mode(void) =20 kvm_hyp_init_symbols(); =20 + hyp_trace_init_events(); + if (is_protected_kvm_enabled()) { if (IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL) && cpus_have_final_cap(ARM64_HAS_ADDRESS_AUTH)) diff --git a/arch/arm64/kvm/hyp/include/nvhe/arm-smccc.h b/arch/arm64/kvm/h= yp/include/nvhe/arm-smccc.h new file mode 100644 index 000000000000..4b69d33e4f2d --- /dev/null +++ b/arch/arm64/kvm/hyp/include/nvhe/arm-smccc.h @@ -0,0 +1,13 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +#include + +#include + +#undef arm_smccc_1_1_smc +#define arm_smccc_1_1_smc(...) 
\ + do { \ + trace_hyp_exit(); \ + __arm_smccc_1_1(SMCCC_SMC_INST, __VA_ARGS__); \ + trace_hyp_enter(); \ + } while (0) diff --git a/arch/arm64/kvm/hyp/include/nvhe/define_events.h b/arch/arm64/k= vm/hyp/include/nvhe/define_events.h new file mode 100644 index 000000000000..3947c1e47ef4 --- /dev/null +++ b/arch/arm64/kvm/hyp/include/nvhe/define_events.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef HYP_EVENT_FILE +# define __HYP_EVENT_FILE +#else +# define __HYP_EVENT_FILE __stringify(HYP_EVENT_FILE) +#endif + +#undef HYP_EVENT +#define HYP_EVENT(__name, __proto, __struct, __assign, __printk) \ + atomic_t __ro_after_init __name##_enabled =3D ATOMIC_INIT(0); \ + struct hyp_event_id hyp_event_id_##__name \ + __section(".hyp.event_ids") =3D { \ + .data =3D (void *)&__name##_enabled, \ + } + +#define HYP_EVENT_MULTI_READ +#include __HYP_EVENT_FILE +#undef HYP_EVENT_MULTI_READ + +#undef HYP_EVENT diff --git a/arch/arm64/kvm/hyp/include/nvhe/trace.h b/arch/arm64/kvm/hyp/i= nclude/nvhe/trace.h index 1004e1edf24f..8384801f88c0 100644 --- a/arch/arm64/kvm/hyp/include/nvhe/trace.h +++ b/arch/arm64/kvm/hyp/include/nvhe/trace.h @@ -2,6 +2,7 @@ #ifndef __ARM64_KVM_HYP_NVHE_TRACE_H #define __ARM64_KVM_HYP_NVHE_TRACE_H #include +#include =20 /* Internal struct that needs export for hyp-constants.c */ struct hyp_buffer_page { @@ -15,6 +16,24 @@ struct hyp_buffer_page { #ifdef CONFIG_TRACING void *tracing_reserve_entry(unsigned long length); void tracing_commit_entry(void); +#define HYP_EVENT(__name, __proto, __struct, __assign, __printk) \ + HYP_EVENT_FORMAT(__name, __struct); \ + extern atomic_t __name##_enabled; \ + extern struct hyp_event_id hyp_event_id_##__name; \ + static inline void trace_##__name(__proto) \ + { \ + size_t length =3D sizeof(struct trace_hyp_format_##__name); \ + struct trace_hyp_format_##__name *__entry; \ + \ + if (!atomic_read(&__name##_enabled)) \ + return; \ + __entry =3D tracing_reserve_entry(length); \ + if (!__entry) \ + return; 
\ + __entry->hdr.id =3D hyp_event_id_##__name.id; \ + __assign \ + tracing_commit_entry(); \ + } =20 void __pkvm_update_clock_tracing(u32 mult, u32 shift, u64 epoch_ns, u64 ep= och_cyc); int __pkvm_load_tracing(unsigned long desc_va, size_t desc_size); @@ -22,9 +41,12 @@ void __pkvm_teardown_tracing(void); int __pkvm_enable_tracing(bool enable); int __pkvm_reset_tracing(unsigned int cpu); int __pkvm_swap_reader_tracing(unsigned int cpu); +int __pkvm_enable_event(unsigned short id, bool enable); #else static inline void *tracing_reserve_entry(unsigned long length) { return N= ULL; } static inline void tracing_commit_entry(void) { } +#define HYP_EVENT(__name, __proto, __struct, __assign, __printk) \ + static inline void trace_##__name(__proto) {} =20 static inline void __pkvm_update_clock_tracing(u32 mult, u32 shift, u64 epoch_ns, u64 ep= och_cyc) { } @@ -33,5 +55,6 @@ static inline void __pkvm_teardown_tracing(void) { } static inline int __pkvm_enable_tracing(bool enable) { return -ENODEV; } static inline int __pkvm_reset_tracing(unsigned int cpu) { return -ENODEV;= } static inline int __pkvm_swap_reader_tracing(unsigned int cpu) { return -E= NODEV; } +static inline int __pkvm_enable_event(unsigned short id, bool enable) { r= eturn -ENODEV; } #endif #endif diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Mak= efile index 40f243c44cf5..fc11e47a1e90 100644 --- a/arch/arm64/kvm/hyp/nvhe/Makefile +++ b/arch/arm64/kvm/hyp/nvhe/Makefile @@ -28,7 +28,7 @@ hyp-obj-y :=3D timer-sr.o sysreg-sr.o debug-sr.o switch.o= tlb.o hyp-init.o host.o hyp-obj-y +=3D ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../en= try.o \ ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o hyp-obj-$(CONFIG_LIST_HARDENED) +=3D list_debug.o -hyp-obj-$(CONFIG_TRACING) +=3D clock.o trace.o +hyp-obj-$(CONFIG_TRACING) +=3D clock.o events.o trace.o hyp-obj-y +=3D $(lib-objs) =20 ## diff --git a/arch/arm64/kvm/hyp/nvhe/events.c b/arch/arm64/kvm/hyp/nvhe/eve= nts.c new file 
mode 100644 index 000000000000..ad214f3f698c --- /dev/null +++ b/arch/arm64/kvm/hyp/nvhe/events.c @@ -0,0 +1,35 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2023 Google LLC + */ + +#include +#include + +#include + +extern struct hyp_event_id __hyp_event_ids_start[]; +extern struct hyp_event_id __hyp_event_ids_end[]; + +int __pkvm_enable_event(unsigned short id, bool enable) +{ + struct hyp_event_id *event_id =3D __hyp_event_ids_start; + atomic_t *enable_key; + + for (; (unsigned long)event_id < (unsigned long)__hyp_event_ids_end; + event_id++) { + if (event_id->id !=3D id) + continue; + + enable_key =3D (atomic_t *)event_id->data; + enable_key =3D hyp_fixmap_map(__hyp_pa(enable_key)); + + atomic_set(enable_key, enable); + + hyp_fixmap_unmap(); + + return 0; + } + + return -EINVAL; +} diff --git a/arch/arm64/kvm/hyp/nvhe/ffa.c b/arch/arm64/kvm/hyp/nvhe/ffa.c index e715c157c2c4..d17ef3771e6a 100644 --- a/arch/arm64/kvm/hyp/nvhe/ffa.c +++ b/arch/arm64/kvm/hyp/nvhe/ffa.c @@ -26,10 +26,10 @@ * the duration and are therefore serialised. 
*/ =20 -#include #include #include =20 +#include #include #include #include diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/h= yp-main.c index dc7a85922117..f9983d4a8d4c 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c @@ -11,6 +11,7 @@ #include #include #include +#include #include =20 #include @@ -422,6 +423,14 @@ static void handle___pkvm_swap_reader_tracing(struct k= vm_cpu_context *host_ctxt) cpu_reg(host_ctxt, 1) =3D __pkvm_swap_reader_tracing(cpu); } =20 +static void handle___pkvm_enable_event(struct kvm_cpu_context *host_ctxt) +{ + DECLARE_REG(unsigned short, id, host_ctxt, 1); + DECLARE_REG(bool, enable, host_ctxt, 2); + + cpu_reg(host_ctxt, 1) =3D __pkvm_enable_event(id, enable); +} + typedef void (*hcall_t)(struct kvm_cpu_context *); =20 #define HANDLE_FUNC(x) [__KVM_HOST_SMCCC_FUNC_##x] =3D (hcall_t)handle_##x @@ -460,6 +469,7 @@ static const hcall_t host_hcall[] =3D { HANDLE_FUNC(__pkvm_enable_tracing), HANDLE_FUNC(__pkvm_reset_tracing), HANDLE_FUNC(__pkvm_swap_reader_tracing), + HANDLE_FUNC(__pkvm_enable_event), }; =20 static void handle_host_hcall(struct kvm_cpu_context *host_ctxt) @@ -500,7 +510,9 @@ static void handle_host_hcall(struct kvm_cpu_context *h= ost_ctxt) =20 static void default_host_smc_handler(struct kvm_cpu_context *host_ctxt) { + trace_hyp_exit(); __kvm_hyp_host_forward_smc(host_ctxt); + trace_hyp_enter(); } =20 static void handle_host_smc(struct kvm_cpu_context *host_ctxt) @@ -524,6 +536,8 @@ void handle_trap(struct kvm_cpu_context *host_ctxt) { u64 esr =3D read_sysreg_el2(SYS_ESR); =20 + trace_hyp_enter(); + switch (ESR_ELx_EC(esr)) { case ESR_ELx_EC_HVC64: handle_host_hcall(host_ctxt); @@ -543,4 +557,6 @@ void handle_trap(struct kvm_cpu_context *host_ctxt) default: BUG(); } + + trace_hyp_exit(); } diff --git a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S b/arch/arm64/kvm/hyp/nvhe/hy= p.lds.S index f4562f417d3f..9d0ce68f1ced 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S +++ 
b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S @@ -16,6 +16,10 @@ SECTIONS { HYP_SECTION(.text) HYP_SECTION(.data..ro_after_init) HYP_SECTION(.rodata) +#ifdef CONFIG_TRACING + . =3D ALIGN(PAGE_SIZE); + HYP_SECTION(.event_ids) +#endif =20 /* * .hyp..data..percpu needs to be page aligned to maintain the same diff --git a/arch/arm64/kvm/hyp/nvhe/psci-relay.c b/arch/arm64/kvm/hyp/nvhe= /psci-relay.c index dfe8fe0f7eaf..1315fb6df3a3 100644 --- a/arch/arm64/kvm/hyp/nvhe/psci-relay.c +++ b/arch/arm64/kvm/hyp/nvhe/psci-relay.c @@ -6,11 +6,12 @@ =20 #include #include +#include #include -#include #include #include =20 +#include #include #include =20 @@ -153,6 +154,7 @@ static int psci_cpu_suspend(u64 func_id, struct kvm_cpu= _context *host_ctxt) DECLARE_REG(u64, power_state, host_ctxt, 1); DECLARE_REG(unsigned long, pc, host_ctxt, 2); DECLARE_REG(unsigned long, r0, host_ctxt, 3); + int ret; =20 struct psci_boot_args *boot_args; struct kvm_nvhe_init_params *init_params; @@ -171,9 +173,11 @@ static int psci_cpu_suspend(u64 func_id, struct kvm_cp= u_context *host_ctxt) * Will either return if shallow sleep state, or wake up into the entry * point if it is a deep sleep state. 
*/ - return psci_call(func_id, power_state, - __hyp_pa(&kvm_hyp_cpu_resume), - __hyp_pa(init_params)); + ret =3D psci_call(func_id, power_state, + __hyp_pa(&kvm_hyp_cpu_resume), + __hyp_pa(init_params)); + + return ret; } =20 static int psci_system_suspend(u64 func_id, struct kvm_cpu_context *host_c= txt) @@ -205,6 +209,7 @@ asmlinkage void __noreturn __kvm_host_psci_cpu_entry(bo= ol is_cpu_on) struct psci_boot_args *boot_args; struct kvm_cpu_context *host_ctxt; =20 + trace_hyp_enter(); host_ctxt =3D host_data_ptr(host_ctxt); =20 if (is_cpu_on) @@ -218,6 +223,7 @@ asmlinkage void __noreturn __kvm_host_psci_cpu_entry(bo= ol is_cpu_on) if (is_cpu_on) release_boot_args(boot_args); =20 + trace_hyp_exit(); __host_enter(host_ctxt); } =20 diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/swi= tch.c index 8f5c56d5b1cd..1604576d3975 100644 --- a/arch/arm64/kvm/hyp/nvhe/switch.c +++ b/arch/arm64/kvm/hyp/nvhe/switch.c @@ -7,7 +7,6 @@ #include #include =20 -#include #include #include #include @@ -21,6 +20,7 @@ #include #include #include +#include #include #include #include @@ -327,10 +327,13 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu) __debug_switch_to_guest(vcpu); =20 do { + trace_hyp_exit(); + /* Jump in the fire! */ exit_code =3D __guest_enter(vcpu); =20 /* And we're baaack! 
*/ + trace_hyp_enter(); } while (fixup_guest_exit(vcpu, &exit_code)); =20 __sysreg_save_state_nvhe(guest_ctxt); diff --git a/arch/arm64/kvm/hyp_events.c b/arch/arm64/kvm/hyp_events.c new file mode 100644 index 000000000000..336c5e3e9b3f --- /dev/null +++ b/arch/arm64/kvm/hyp_events.c @@ -0,0 +1,165 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2023 Google LLC + */ + +#include + +#include +#include +#include + +#include "hyp_trace.h" + +extern struct hyp_event __hyp_events_start[]; +extern struct hyp_event __hyp_events_end[]; + +/* hyp_event section used by the hypervisor */ +extern struct hyp_event_id __hyp_event_ids_start[]; +extern struct hyp_event_id __hyp_event_ids_end[]; + +static ssize_t +hyp_event_write(struct file *filp, const char __user *ubuf, size_t cnt, lo= ff_t *ppos) +{ + struct seq_file *seq_file =3D (struct seq_file *)filp->private_data; + struct hyp_event *evt =3D (struct hyp_event *)seq_file->private; + unsigned short id =3D evt->id; + bool enabling; + int ret; + char c; + + if (!cnt || cnt > 2) + return -EINVAL; + + if (get_user(c, ubuf)) + return -EFAULT; + + switch (c) { + case '1': + enabling =3D true; + break; + case '0': + enabling =3D false; + break; + default: + return -EINVAL; + } + + if (enabling !=3D *evt->enabled) { + ret =3D kvm_call_hyp_nvhe(__pkvm_enable_event, id, enabling); + if (ret) + return ret; + } + + *evt->enabled =3D enabling; + + return cnt; +} + +static int hyp_event_show(struct seq_file *m, void *v) +{ + struct hyp_event *evt =3D (struct hyp_event *)m->private; + + seq_printf(m, "%d\n", *evt->enabled); + + return 0; +} + +static int hyp_event_open(struct inode *inode, struct file *filp) +{ + return single_open(filp, hyp_event_show, inode->i_private); +} + +static const struct file_operations hyp_event_fops =3D { + .open =3D hyp_event_open, + .write =3D hyp_event_write, + .read =3D seq_read, + .llseek =3D seq_lseek, + .release =3D single_release, +}; + +static int hyp_event_id_show(struct seq_file *m, 
void *v) +{ + struct hyp_event *evt =3D (struct hyp_event *)m->private; + + seq_printf(m, "%d\n", evt->id); + + return 0; +} + +static int hyp_event_id_open(struct inode *inode, struct file *filp) +{ + return single_open(filp, hyp_event_id_show, inode->i_private); +} + +static const struct file_operations hyp_event_id_fops =3D { + .open =3D hyp_event_id_open, + .read =3D seq_read, + .llseek =3D seq_lseek, + .release =3D single_release, +}; + +void hyp_trace_init_event_tracefs(struct dentry *parent) +{ + struct hyp_event *event =3D __hyp_events_start; + + parent =3D tracefs_create_dir("events", parent); + if (!parent) { + pr_err("Failed to create tracefs folder for hyp events\n"); + return; + } + + parent =3D tracefs_create_dir("hypervisor", parent); + if (!parent) { + pr_err("Failed to create tracefs folder for hyp events\n"); + return; + } + + for (; (unsigned long)event < (unsigned long)__hyp_events_end; event++) { + struct dentry *event_dir =3D tracefs_create_dir(event->name, parent); + + if (!event_dir) { + pr_err("Failed to create events/hypervisor/%s\n", + event->name); + continue; + } + + tracefs_create_file("enable", 0700, event_dir, (void *)event, + &hyp_event_fops); + tracefs_create_file("id", 0400, event_dir, (void *)event, + &hyp_event_id_fops); + } +} + +struct hyp_event *hyp_trace_find_event(int id) +{ + struct hyp_event *event =3D __hyp_events_start + id; + + if ((unsigned long)event >=3D (unsigned long)__hyp_events_end) + return NULL; + + return event; +} + +/* + * Register hyp events and write their id into the hyp section _hyp_event_= ids. + */ +int hyp_trace_init_events(void) +{ + struct hyp_event_id *hyp_event_id =3D __hyp_event_ids_start; + struct hyp_event *event =3D __hyp_events_start; + int id =3D 0; + + for (; (unsigned long)event < (unsigned long)__hyp_events_end; + event++, hyp_event_id++, id++) { + + /* + * Both the host and the hypervisor relies on the same hyp event + * declarations from kvm_hypevents.h. We have then a 1:1 + * mapping. 
+		 */
+		event->id = hyp_event_id->id = id;
+	}
+
+	return 0;
+}
diff --git a/arch/arm64/kvm/hyp_trace.c b/arch/arm64/kvm/hyp_trace.c
index 8ac8f9763cbd..292f7abc23f4 100644
--- a/arch/arm64/kvm/hyp_trace.c
+++ b/arch/arm64/kvm/hyp_trace.c
@@ -6,10 +6,12 @@
 
 #include
 #include
+#include
 #include
 
 #include
 #include
+#include
 
 #include "hyp_constants.h"
 #include "hyp_trace.h"
@@ -560,6 +562,8 @@ static void ht_print_trace_cpu(struct ht_iterator *iter)
 
 static int ht_print_trace_fmt(struct ht_iterator *iter)
 {
+	struct hyp_event *e;
+
 	if (iter->lost_events)
 		trace_seq_printf(&iter->seq, "CPU:%d [LOST %lu EVENTS]\n",
 				 iter->ent_cpu, iter->lost_events);
@@ -567,6 +571,12 @@ static int ht_print_trace_fmt(struct ht_iterator *iter)
 	ht_print_trace_cpu(iter);
 	ht_print_trace_time(iter);
 
+	e = hyp_trace_find_event(iter->ent->id);
+	if (e)
+		e->trace_func(iter);
+	else
+		trace_seq_printf(&iter->seq, "Unknown event id %d\n", iter->ent->id);
+
 	return trace_seq_has_overflowed(&iter->seq) ?
 		-EOVERFLOW : 0;
 };
 
@@ -934,5 +944,7 @@ int hyp_trace_init_tracefs(void)
 				    (void *)cpu, &hyp_trace_fops);
 	}
 
+	hyp_trace_init_event_tracefs(root);
+
 	return 0;
 }
diff --git a/arch/arm64/kvm/hyp_trace.h b/arch/arm64/kvm/hyp_trace.h
index 14fc06c625a6..3ac648415bf9 100644
--- a/arch/arm64/kvm/hyp_trace.h
+++ b/arch/arm64/kvm/hyp_trace.h
@@ -3,26 +3,13 @@
 #ifndef __ARM64_KVM_HYP_TRACE_H__
 #define __ARM64_KVM_HYP_TRACE_H__
 
-#include
-#include
-
-struct ht_iterator {
-	struct trace_buffer *trace_buffer;
-	int cpu;
-	struct hyp_entry_hdr *ent;
-	unsigned long lost_events;
-	int ent_cpu;
-	size_t ent_size;
-	u64 ts;
-	void *spare;
-	size_t copy_leftover;
-	struct trace_seq seq;
-	struct delayed_work poll_work;
-};
-
 #ifdef CONFIG_TRACING
 int hyp_trace_init_tracefs(void);
+int hyp_trace_init_events(void);
+struct hyp_event *hyp_trace_find_event(int id);
+void hyp_trace_init_event_tracefs(struct dentry *parent);
 #else
 static inline int hyp_trace_init_tracefs(void) { return 0; }
+static inline int hyp_trace_init_events(void) { return 0; }
 #endif
 #endif
-- 
2.46.0.598.g6f2099f65c-goog

From nobody Sat Nov 30 05:43:21 2024
Date: Wed, 11 Sep 2024 10:30:29 +0100
In-Reply-To: <20240911093029.3279154-1-vdonnefort@google.com>
References: <20240911093029.3279154-1-vdonnefort@google.com>
X-Mailer: git-send-email 2.46.0.598.g6f2099f65c-goog
Message-ID: <20240911093029.3279154-14-vdonnefort@google.com>
Subject: [PATCH 13/13] KVM: arm64: Add kselftest for tracefs hyp tracefs
From: Vincent Donnefort
To: rostedt@goodmis.org, mhiramat@kernel.org,
    linux-trace-kernel@vger.kernel.org, maz@kernel.org, oliver.upton@linux.dev
Cc: kvmarm@lists.linux.dev, will@kernel.org, qperret@google.com,
    kernel-team@android.com, linux-kernel@vger.kernel.org, Vincent Donnefort
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Add a test to validate the newly introduced tracefs interface for the
pKVM hypervisor.
This test covers the use of extended timestamps and the coherence of the
tracing clock.

Signed-off-by: Vincent Donnefort

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 3710deb6eaa0..be7d4d2434e7 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -86,6 +86,7 @@ enum __kvm_host_smccc_func {
 	__KVM_HOST_SMCCC_FUNC___pkvm_reset_tracing,
 	__KVM_HOST_SMCCC_FUNC___pkvm_swap_reader_tracing,
 	__KVM_HOST_SMCCC_FUNC___pkvm_enable_event,
+	__KVM_HOST_SMCCC_FUNC___pkvm_selftest_event,
 };
 
 #define DECLARE_KVM_VHE_SYM(sym)	extern char sym[]
diff --git a/arch/arm64/include/asm/kvm_hypevents.h b/arch/arm64/include/asm/kvm_hypevents.h
index 0b98a87a1250..1c797b748ff2 100644
--- a/arch/arm64/include/asm/kvm_hypevents.h
+++ b/arch/arm64/include/asm/kvm_hypevents.h
@@ -28,4 +28,14 @@ HYP_EVENT(hyp_exit,
 	),
 	HE_PRINTK(" ")
 );
+
+#ifdef CONFIG_PROTECTED_NVHE_TESTING
+HYP_EVENT(selftest,
+	HE_PROTO(void),
+	HE_STRUCT(),
+	HE_ASSIGN(),
+	HE_PRINTK(" ")
+);
 #endif
+
+#endif /* __ARM64_KVM_HYPEVENTS_H_ */
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 8304eb342be9..c7ae07a88875 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -66,4 +66,13 @@ config PROTECTED_NVHE_STACKTRACE
 
 	  If unsure, or not using protected nVHE (pKVM), say N.
 
+config PROTECTED_NVHE_TESTING
+	bool "Protected KVM hypervisor testing infrastructure"
+	depends on KVM
+	default n
+	help
+	  Say Y here to enable pKVM hypervisor testing infrastructure.
+
+	  If unsure, say N.
+
 endif # VIRTUALIZATION
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index f9983d4a8d4c..2c040585fdd2 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -431,6 +431,19 @@ static void handle___pkvm_enable_event(struct kvm_cpu_context *host_ctxt)
 	cpu_reg(host_ctxt, 1) = __pkvm_enable_event(id, enable);
 }
 
+static void handle___pkvm_selftest_event(struct kvm_cpu_context *host_ctxt)
+{
+	int smc_ret = SMCCC_RET_NOT_SUPPORTED, ret = -EOPNOTSUPP;
+
+#ifdef CONFIG_PROTECTED_NVHE_TESTING
+	trace_selftest();
+	smc_ret = SMCCC_RET_SUCCESS;
+	ret = 0;
+#endif
+	cpu_reg(host_ctxt, 0) = smc_ret;
+	cpu_reg(host_ctxt, 1) = ret;
+}
+
 typedef void (*hcall_t)(struct kvm_cpu_context *);
 
 #define HANDLE_FUNC(x)	[__KVM_HOST_SMCCC_FUNC_##x] = (hcall_t)handle_##x
@@ -470,6 +483,7 @@ static const hcall_t host_hcall[] = {
 	HANDLE_FUNC(__pkvm_reset_tracing),
 	HANDLE_FUNC(__pkvm_swap_reader_tracing),
 	HANDLE_FUNC(__pkvm_enable_event),
+	HANDLE_FUNC(__pkvm_selftest_event),
 };
 
 static void handle_host_hcall(struct kvm_cpu_context *host_ctxt)
diff --git a/arch/arm64/kvm/hyp_trace.c b/arch/arm64/kvm/hyp_trace.c
index 292f7abc23f4..356ce3042936 100644
--- a/arch/arm64/kvm/hyp_trace.c
+++ b/arch/arm64/kvm/hyp_trace.c
@@ -887,6 +887,36 @@ static int hyp_trace_clock_show(struct seq_file *m, void *v)
 }
 DEFINE_SHOW_ATTRIBUTE(hyp_trace_clock);
 
+#ifdef CONFIG_PROTECTED_NVHE_TESTING
+static int selftest_event_open(struct inode *inode, struct file *file)
+{
+	if (file->f_mode & FMODE_WRITE)
+		return kvm_call_hyp_nvhe(__pkvm_selftest_event);
+
+	return 0;
+}
+
+static ssize_t selftest_event_write(struct file *f, const char __user *buf,
+				    size_t cnt, loff_t *pos)
+{
+	return cnt;
+}
+
+static const struct file_operations selftest_event_fops = {
+	.open = selftest_event_open,
+	.write = selftest_event_write,
+	.llseek = no_llseek,
+};
+
+static void hyp_trace_init_testing_tracefs(struct dentry *root)
+{
+	tracefs_create_file("selftest_event", TRACEFS_MODE_WRITE, root, NULL,
+			    &selftest_event_fops);
+}
+#else
+static void hyp_trace_init_testing_tracefs(struct dentry *root) { }
+#endif
+
 int hyp_trace_init_tracefs(void)
 {
 	struct dentry *root, *per_cpu_root;
@@ -945,6 +975,7 @@ int hyp_trace_init_tracefs(void)
 	}
 
 	hyp_trace_init_event_tracefs(root);
+	hyp_trace_init_testing_tracefs(root);
 
 	return 0;
 }
diff --git a/tools/testing/selftests/hyp-trace/Makefile b/tools/testing/selftests/hyp-trace/Makefile
new file mode 100644
index 000000000000..2a5b2e29667e
--- /dev/null
+++ b/tools/testing/selftests/hyp-trace/Makefile
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0
+all:
+
+TEST_PROGS := hyp-trace-test
+
+include ../lib.mk
diff --git a/tools/testing/selftests/hyp-trace/config b/tools/testing/selftests/hyp-trace/config
new file mode 100644
index 000000000000..39cee8ec30fa
--- /dev/null
+++ b/tools/testing/selftests/hyp-trace/config
@@ -0,0 +1,4 @@
+CONFIG_FTRACE=y
+CONFIG_ARM64=y
+CONFIG_KVM=y
+CONFIG_PROTECTED_NVHE_TESTING=y
diff --git a/tools/testing/selftests/hyp-trace/hyp-trace-test b/tools/testing/selftests/hyp-trace/hyp-trace-test
new file mode 100755
index 000000000000..868eb81bfb77
--- /dev/null
+++ b/tools/testing/selftests/hyp-trace/hyp-trace-test
@@ -0,0 +1,254 @@
+#!/bin/sh -e
+# SPDX-License-Identifier: GPL-2.0-only
+
+# hyp-trace-test - Tracefs for pKVM hypervisor test
+#
+# Copyright (C) 2024 - Google LLC
+# Author: Vincent Donnefort
+#
+
+log_and_die()
+{
+	echo "$1"
+
+	exit 1
+}
+
+host_clock()
+{
+	# BOOTTIME clock
+	awk '/now/ { printf "%.6f\n", $3 / 1000000000 }' /proc/timer_list
+}
+
+page_size()
+{
+	echo "$(awk '/KernelPageSize/ {print $2; exit}' /proc/self/smaps) * 1024" | bc
+}
+
+goto_hyp_trace()
+{
+	if [ -d "/sys/kernel/debug/tracing/hypervisor" ]; then
+		cd /sys/kernel/debug/tracing/hypervisor
+		return
+	fi
+
+	if [ -d "/sys/kernel/tracing/hypervisor" ]; then
+		cd /sys/kernel/tracing/hypervisor
+		return
+	fi
+
+	echo "ERROR: hyp tracing folder not found!"
+
+	exit 1
+}
+
+reset_hyp_trace()
+{
+	echo 0 > tracing_on
+	echo 0 > trace
+	for event in events/hypervisor/*; do
+		echo 0 > $event/enable
+	done
+}
+
+setup_hyp_trace()
+{
+	reset_hyp_trace
+
+	echo 16 > buffer_size_kb
+	echo 1 > events/hypervisor/selftest/enable
+	echo 1 > tracing_on
+}
+
+stop_hyp_trace()
+{
+	echo 0 > tracing_on
+}
+
+hyp_trace_loaded()
+{
+	grep -q "(loaded)" buffer_size_kb
+}
+
+write_events()
+{
+	local num="$1"
+	local func="$2"
+
+	for i in $(seq 1 $num); do
+		echo 1 > selftest_event
+		[ -z "$func" -o $i -eq $num ] || eval $func
+	done
+}
+
+consuming_read()
+{
+	local output=$1
+
+	cat trace_pipe > $output &
+
+	echo $!
+}
+
+run_test_consuming()
+{
+	local nr_events=$1
+	local func=$2
+	local tmp="$(mktemp)"
+	local start_ts=0
+	local end_ts=0
+	local pid=0
+
+	echo "Output trace file: $tmp"
+
+	setup_hyp_trace
+	pid=$(consuming_read $tmp)
+
+	start_ts=$(host_clock)
+	write_events $nr_events $func
+	stop_hyp_trace
+	end_ts=$(host_clock)
+
+	kill $pid
+	validate_test $tmp $nr_events $start_ts $end_ts
+
+	rm $tmp
+}
+
+validate_test()
+{
+	local output=$1
+	local expected_events=$2
+	local start_ts=$3
+	local end_ts=$4
+	local prev_ts=$3
+	local ts=0
+	local num_events=0
+
+	IFS=$'\n'
+	for line in $(cat $output); do
+		echo "$line" | grep -q -E "^# " && continue
+		ts=$(echo "$line" | awk '{print $2}' | cut -d ':' -f1)
+		if [ $(echo "$ts<$prev_ts" | bc) -eq 1 ]; then
+			log_and_die "Error event @$ts < $prev_ts"
+		fi
+		prev_ts=$ts
+		num_events=$((num_events + 1))
+	done
+
+	if [ $(echo "$ts>$end_ts" | bc) -eq 1 ]; then
+		log_and_die "Error event @$ts > $end_ts"
+	fi
+
+	if [ $num_events -ne $expected_events ]; then
+		log_and_die "Expected $expected_events events, got $num_events"
+	fi
+}
+
+test_ts()
+{
+	echo "Test Timestamps..."
+
+	run_test_consuming 1000
+
+	echo "done."
+}
+
+test_extended_ts()
+{
+	echo "Test Extended Timestamps..."
+
+	run_test_consuming 1000 "sleep 0.1"
+
+	echo "done."
+}
+
+assert_loaded()
+{
+	hyp_trace_loaded || log_and_die "Expected loaded buffer"
+}
+
+assert_unloaded()
+{
+	! hyp_trace_loaded || log_and_die "Expected unloaded buffer"
+}
+
+test_unloading()
+{
+	local tmp="$(mktemp)"
+
+	echo "Test unloading..."
+
+	setup_hyp_trace
+	assert_loaded
+
+	echo 0 > tracing_on
+	assert_unloaded
+
+	pid=$(consuming_read $tmp)
+	sleep 1
+	assert_loaded
+	kill $pid
+	assert_unloaded
+
+	echo 1 > tracing_on
+	write_events 1
+	echo 0 > trace
+	assert_loaded
+	echo 0 > tracing_on
+	assert_unloaded
+
+	echo "done."
+}
+
+test_reset()
+{
+	local tmp="$(mktemp)"
+
+	echo "Test Reset..."
+
+	setup_hyp_trace
+	write_events 5
+	echo 0 > trace
+	write_events 5
+
+	pid=$(consuming_read $tmp)
+	sleep 1
+	stop_hyp_trace
+	kill $pid
+
+	validate_test $tmp 5 0 $(host_clock)
+
+	rm $tmp
+
+	echo "done."
+}
+
+test_big_bpacking()
+{
+	local hyp_buffer_page_size=48
+	local page_size=$(page_size)
+	local min_buf_size=$(echo "$page_size * $page_size / ($hyp_buffer_page_size * $(nproc))" | bc)
+
+	min_buf_size=$(echo "$min_buf_size * 2 / 1024" | bc)
+
+	echo "Test loading $min_buf_size kB buffer..."
+
+	reset_hyp_trace
+	echo $min_buf_size > buffer_size_kb
+	echo 1 > tracing_on
+
+	stop_hyp_trace
+
+	echo "done."
+}
+
+goto_hyp_trace
+
+test_reset
+test_unloading
+test_big_bpacking
+test_ts
+test_extended_ts
+
+exit 0
-- 
2.46.0.598.g6f2099f65c-goog
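[Editor's note] The selftest's `host_clock` helper converts the BOOTTIME "now" line of `/proc/timer_list` from nanoseconds to seconds. The same awk expression can be exercised on a synthetic line; the sample value below is made up, the real script reads `/proc/timer_list` directly:

```shell
# Sketch: feed host_clock's awk expression a fabricated "now" line
# shaped like the one found in /proc/timer_list ("now at <ns> nsecs").
sample='now at 5000000000 nsecs'
echo "$sample" | awk '/now/ { printf "%.6f\n", $3 / 1000000000 }'
# Converts the third field (nanoseconds) to seconds with 6 decimals.
```

This is why `validate_test` can compare event timestamps against `host_clock` values with `bc`: both sides are plain decimal seconds.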
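[Editor's note] `test_big_bpacking` sizes the buffer so that the per-page metadata (48 bytes per hypervisor buffer page) itself spans more than one host page. With hypothetical inputs of 4 KiB pages and 8 CPUs (the script reads `page_size()` and `$(nproc)` at runtime, and pipes through `bc`; plain shell integer arithmetic gives the same results here), the sizing works out as:

```shell
# Sketch of test_big_bpacking's sizing arithmetic with assumed values.
page_size=4096            # assumed: 4 KiB host pages
nr_cpus=8                 # assumed: CPU count
hyp_buffer_page_size=48   # bytes of metadata per hypervisor buffer page

# Size (bytes per CPU) so the buffer-page descriptors fill a whole host page,
# then double it and convert to kB for buffer_size_kb:
min_buf_size=$(( page_size * page_size / (hyp_buffer_page_size * nr_cpus) ))
min_buf_size=$(( min_buf_size * 2 / 1024 ))
echo "$min_buf_size"   # prints 85 with these assumed values
```

Writing that value to `buffer_size_kb` forces the hypervisor to map a multi-page descriptor array, which is the load path this test stresses.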