From nobody Sun Oct 5 03:40:35 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5CB861C3C14 for ; Mon, 11 Aug 2025 10:10:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754907011; cv=none; b=MMq4/8R+NR7BCibrrd/ulyoWeYltf6UiWkJW/WVFeO0XFBtWD1Ey1gu+o3GXGmgnDemgzmnpXdkHXHpidZlA0vc9Xpn5YJVqXh+lhhPuvd5VaYQeKTENBraxcH2e0LMWk2Pk8m1wtwiHU2cUjqnZMPoAXL+S2QlU36O+Nf30/Ec= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754907011; c=relaxed/simple; bh=938H2CFURsbr0s0/uVk468+1SsLIfHSCAH7pP2aUps4=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=qkjz7iC5nkvZJTTcAaiztobCtT0LArZFuLNy+JiGQXs1OUzlmq++Rl2d6OoQ5ksHx8TLDOmWiBdoNjphz6xkE3T3VgNh3bUOvVCYhSgDVwa3BlLsMrR8pWrveFVRegRCu2jsNxoVBLk5vnl42kFzza7Cl2PqkW2v3WB4wzXoMxI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=EHs1bHMq; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="EHs1bHMq" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1754907008; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=pzNdHcQoAbjyzhflSBM22mFHbw/aig0BYZserUsQZa0=; b=EHs1bHMq5WCPxia3YxXwWhsTpp07FkJHYc3h/3eno3u+QDKFo33jyDnrohY1d1CzSbxkUZ 0pm0cpejdwzB+S3SgAqXHIRKpIEmES062OEcsFYHEb4nr5Zoi/1hY7vZ+nvm6UA5y2+Wlx al3CYiK/3NDw+j6ZYSbyzTySSabTewM= Received: from mail-ot1-f70.google.com (mail-ot1-f70.google.com [209.85.210.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-386-2q_8J7y0Mru4AB265ft-rQ-1; Mon, 11 Aug 2025 06:10:07 -0400 X-MC-Unique: 2q_8J7y0Mru4AB265ft-rQ-1 X-Mimecast-MFC-AGG-ID: 2q_8J7y0Mru4AB265ft-rQ_1754907006 Received: by mail-ot1-f70.google.com with SMTP id 46e09a7af769-72e2f2b9ea1so5491695a34.3 for ; Mon, 11 Aug 2025 03:10:06 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754907006; x=1755511806; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=pzNdHcQoAbjyzhflSBM22mFHbw/aig0BYZserUsQZa0=; b=aq+1CH+LQITZBfVgm956D896xIEYML8cBb4qiMoJ2E2YIdL/cYdvB6US0HDk0te2UT y8SpXiNzoMPhBFPRrvQNBQfwTUw9tv/UPQHcUeLA0pKadW10ceT7C63+c1bjquoQ4h6R YuufsUvdXfqVHGLY++vpnQZJ4Z85j1DW1bud2x5FmyKYX8e2xcb6v/JgOnet9TUX/4l6 poCPm315o4BRL0k6TVfDqrJJy61GemZQmXct0NPbJeaUyH/wCk/y27OWEFeFnVxOOlHV EBTTU8kQ+pP9Jz6hs+r0u5/Ce5nlhotxIs/jx7g8Vqzjl2HlmYFhetH94jGFy5LLThFm 21gw== X-Gm-Message-State: AOJu0YyT3KJ7MmD3NHrA/Y7aEl4g99QGIyHhzDZKJUrdxy/Tc8q2I0pb xQ3Rj4YBbEMm06peEEdIYuW0JPFMX49ifpU92/GaJ+PYV5hBTZHXqdAsyYFCDOYVIfXUb/KUYw7 lC91IE6b+OsjIF3SbPo14beGc1iy19UmZtNdXBP1puX/HIFzjfVpJICYDIGp4nLt4Bw== X-Gm-Gg: ASbGncsMqMttOZhZEP15+q+q4h/TT4eLjgABQ/CEtJ3iepyu0C4FiAGt3loC3Vtg1E+ TdHAq652sQ2vTtvaOG8Kt3Fu1SJ5PQBNTLbY7+rMP+H702lCtYpHYAsS2tkAESYA7dDO5tHRbBw RmDwkAu8kPrEFx/R4P4oHFqXRxIsqaaIQslTOwrx9Wo1yRVh6P3MmHilwltZ6m8HdHpHRN5b+O/ IUPN0dHSgAw+e2s2sqlq0pnAH/tGmEtccWITmL/s1IJs3xPSFql4No+WHYpt9uVCdmwL79EdCRS qzR7iM69qW52x3iHm5czSLwRcBYuadseJ0WMe4BpZ/dR/f+ybOk+l8CjL8RyP4FaFRriMZQfpA= = X-Received: by 2002:a05:6830:4424:b0:743:48c:5ea with SMTP id 46e09a7af769-7432c8ce6cbmr8666586a34.23.1754907006023; Mon, 11 Aug 2025 03:10:06 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFPMkeIuaw5ut7jPZnictoryGgc/pDcmSAfjhAGeLoljOB09UrLqjjQVfnfrD5QQ8/QsoNtTw== X-Received: by 2002:a05:6830:4424:b0:743:48c:5ea with SMTP id 46e09a7af769-7432c8ce6cbmr8666572a34.23.1754907005566; Mon, 11 Aug 2025 03:10:05 -0700 (PDT) Received: from cluster.. (4f.55.790d.ip4.static.sl-reverse.com. [13.121.85.79]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-61b84cb7630sm849522eaf.19.2025.08.11.03.10.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 11 Aug 2025 03:10:05 -0700 (PDT) From: Alex Markuze To: ceph-devel@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Slava.Dubeyko@ibm.com, idryomov@gmail.com, Alex Markuze Subject: [PATCH v3 1/2] ceph: fix client race condition validating r_parent before applying state Date: Mon, 11 Aug 2025 10:09:52 +0000 Message-Id: <20250811100953.3103970-2-amarkuze@redhat.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250811100953.3103970-1-amarkuze@redhat.com> References: <20250811100953.3103970-1-amarkuze@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add validation to ensure the cached parent directory inode matches the directory info in MDS replies. This prevents client-side race conditions where concurrent operations (e.g. rename) cause r_parent to become stale between request initiation and reply processing, which could lead to applying state changes to incorrect directory inodes. --- fs/ceph/debugfs.c | 14 ++-- fs/ceph/dir.c | 17 ++--- fs/ceph/file.c | 24 +++---- fs/ceph/inode.c | 7 +- fs/ceph/mds_client.c | 148 +++++++++++++++++++++++++++---------------- fs/ceph/mds_client.h | 12 +++- 6 files changed, 132 insertions(+), 90 deletions(-) diff --git a/fs/ceph/debugfs.c b/fs/ceph/debugfs.c index fdd404fc8112..4fe5a0266891 100644 --- a/fs/ceph/debugfs.c +++ b/fs/ceph/debugfs.c @@ -55,8 +55,6 @@ static int mdsc_show(struct seq_file *s, void *p) struct ceph_mds_client *mdsc =3D fsc->mdsc; struct ceph_mds_request *req; struct rb_node *rp; - int pathlen =3D 0; - u64 pathbase; char *path; =20 mutex_lock(&mdsc->mutex); @@ -81,8 +79,8 @@ static int mdsc_show(struct seq_file *s, void *p) if (req->r_inode) { seq_printf(s, " #%llx", ceph_ino(req->r_inode)); } else if (req->r_dentry) { - path =3D ceph_mdsc_build_path(mdsc, req->r_dentry, &pathlen, - &pathbase, 0); + struct ceph_path_info path_info; + path =3D ceph_mdsc_build_path(mdsc, req->r_dentry, &path_info, 0); if (IS_ERR(path)) path =3D NULL; spin_lock(&req->r_dentry->d_lock); @@ -91,7 +89,7 @@ static int mdsc_show(struct seq_file *s, void *p) req->r_dentry, path ? path : ""); spin_unlock(&req->r_dentry->d_lock); - ceph_mdsc_free_path(path, pathlen); + ceph_mdsc_free_path(path, path_info.pathlen); } else if (req->r_path1) { seq_printf(s, " #%llx/%s", req->r_ino1.ino, req->r_path1); @@ -100,8 +98,8 @@ static int mdsc_show(struct seq_file *s, void *p) } =20 if (req->r_old_dentry) { - path =3D ceph_mdsc_build_path(mdsc, req->r_old_dentry, &pathlen, - &pathbase, 0); + struct ceph_path_info path_info; + path =3D ceph_mdsc_build_path(mdsc, req->r_old_dentry, &path_info, 0); if (IS_ERR(path)) path =3D NULL; spin_lock(&req->r_old_dentry->d_lock); @@ -111,7 +109,7 @@ static int mdsc_show(struct seq_file *s, void *p) req->r_old_dentry, path ? path : ""); spin_unlock(&req->r_old_dentry->d_lock); - ceph_mdsc_free_path(path, pathlen); + ceph_mdsc_free_path(path, path_info.pathlen); } else if (req->r_path2 && req->r_op !=3D CEPH_MDS_OP_SYMLINK) { if (req->r_ino2.ino) seq_printf(s, " #%llx/%s", req->r_ino2.ino, diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index 1535b6540e9d..3b4ec6219b6d 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -1271,10 +1271,8 @@ static void ceph_async_unlink_cb(struct ceph_mds_cli= ent *mdsc, =20 /* If op failed, mark everyone involved for errors */ if (result) { - int pathlen =3D 0; - u64 base =3D 0; - char *path =3D ceph_mdsc_build_path(mdsc, dentry, &pathlen, - &base, 0); + struct ceph_path_info path_info =3D {0}; + char *path =3D ceph_mdsc_build_path(mdsc, dentry, &path_info, 0); =20 /* mark error on parent + clear complete */ mapping_set_error(req->r_parent->i_mapping, result); @@ -1288,8 +1286,8 @@ static void ceph_async_unlink_cb(struct ceph_mds_clie= nt *mdsc, mapping_set_error(req->r_old_inode->i_mapping, result); =20 pr_warn_client(cl, "failure path=3D(%llx)%s result=3D%d!\n", - base, IS_ERR(path) ? "<>" : path, result); - ceph_mdsc_free_path(path, pathlen); + path_info.vino.ino, IS_ERR(path) ? "<>" : path, result); + ceph_mdsc_free_path(path, path_info.pathlen); } out: iput(req->r_old_inode); @@ -1347,8 +1345,6 @@ static int ceph_unlink(struct inode *dir, struct dent= ry *dentry) int err =3D -EROFS; int op; char *path; - int pathlen; - u64 pathbase; =20 if (ceph_snap(dir) =3D=3D CEPH_SNAPDIR) { /* rmdir .snap/foo is RMSNAP */ @@ -1367,14 +1363,15 @@ static int ceph_unlink(struct inode *dir, struct de= ntry *dentry) if (!dn) { try_async =3D false; } else { - path =3D ceph_mdsc_build_path(mdsc, dn, &pathlen, &pathbase, 0); + struct ceph_path_info path_info; + path =3D ceph_mdsc_build_path(mdsc, dn, &path_info, 0); if (IS_ERR(path)) { try_async =3D false; err =3D 0; } else { err =3D ceph_mds_check_access(mdsc, path, MAY_WRITE); } - ceph_mdsc_free_path(path, pathlen); + ceph_mdsc_free_path(path, path_info.pathlen); dput(dn); =20 /* For none EACCES cases will let the MDS do the mds auth check */ diff --git a/fs/ceph/file.c b/fs/ceph/file.c index c92ef1fc5b8a..e38b6927f50f 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -368,8 +368,6 @@ int ceph_open(struct inode *inode, struct file *file) int flags, fmode, wanted; struct dentry *dentry; char *path; - int pathlen; - u64 pathbase; bool do_sync =3D false; int mask =3D MAY_READ; =20 @@ -399,14 +397,15 @@ int ceph_open(struct inode *inode, struct file *file) if (!dentry) { do_sync =3D true; } else { - path =3D ceph_mdsc_build_path(mdsc, dentry, &pathlen, &pathbase, 0); + struct ceph_path_info path_info; + path =3D ceph_mdsc_build_path(mdsc, dentry, &path_info, 0); if (IS_ERR(path)) { do_sync =3D true; err =3D 0; } else { err =3D ceph_mds_check_access(mdsc, path, mask); } - ceph_mdsc_free_path(path, pathlen); + ceph_mdsc_free_path(path, path_info.pathlen); dput(dentry); =20 /* For none EACCES cases will let the MDS do the mds auth check */ @@ -613,15 +612,13 @@ static void ceph_async_create_cb(struct ceph_mds_clie= nt *mdsc, mapping_set_error(req->r_parent->i_mapping, result); =20 if (result) { - int pathlen =3D 0; - u64 base =3D 0; - char *path =3D ceph_mdsc_build_path(mdsc, req->r_dentry, &pathlen, - &base, 0); + struct ceph_path_info path_info =3D {0}; + char *path =3D ceph_mdsc_build_path(mdsc, req->r_dentry, &path_info, 0); =20 pr_warn_client(cl, "async create failure path=3D(%llx)%s result=3D%d!\n", - base, IS_ERR(path) ? "<>" : path, result); - ceph_mdsc_free_path(path, pathlen); + path_info.vino.ino, IS_ERR(path) ? "<>" : path, result); + ceph_mdsc_free_path(path, path_info.pathlen); =20 ceph_dir_clear_complete(req->r_parent); if (!d_unhashed(dentry)) @@ -789,8 +786,6 @@ int ceph_atomic_open(struct inode *dir, struct dentry *= dentry, int mask; int err; char *path; - int pathlen; - u64 pathbase; =20 doutc(cl, "%p %llx.%llx dentry %p '%pd' %s flags %d mode 0%o\n", dir, ceph_vinop(dir), dentry, dentry, @@ -812,7 +807,8 @@ int ceph_atomic_open(struct inode *dir, struct dentry *= dentry, if (!dn) { try_async =3D false; } else { - path =3D ceph_mdsc_build_path(mdsc, dn, &pathlen, &pathbase, 0); + struct ceph_path_info path_info; + path =3D ceph_mdsc_build_path(mdsc, dn, &path_info, 0); if (IS_ERR(path)) { try_async =3D false; err =3D 0; @@ -824,7 +820,7 @@ int ceph_atomic_open(struct inode *dir, struct dentry *= dentry, mask |=3D MAY_WRITE; err =3D ceph_mds_check_access(mdsc, path, mask); } - ceph_mdsc_free_path(path, pathlen); + ceph_mdsc_free_path(path, path_info.pathlen); dput(dn); =20 /* For none EACCES cases will let the MDS do the mds auth check */ diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index 4365d17ce6d7..58da0f6419f7 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -2502,22 +2502,21 @@ int __ceph_setattr(struct mnt_idmap *idmap, struct = inode *inode, int truncate_retry =3D 20; /* The RMW will take around 50ms */ struct dentry *dentry; char *path; - int pathlen; - u64 pathbase; bool do_sync =3D false; =20 dentry =3D d_find_alias(inode); if (!dentry) { do_sync =3D true; } else { - path =3D ceph_mdsc_build_path(mdsc, dentry, &pathlen, &pathbase, 0); + struct ceph_path_info path_info; + path =3D ceph_mdsc_build_path(mdsc, dentry, &path_info, 0); if (IS_ERR(path)) { do_sync =3D true; err =3D 0; } else { err =3D ceph_mds_check_access(mdsc, path, MAY_WRITE); } - ceph_mdsc_free_path(path, pathlen); + ceph_mdsc_free_path(path, path_info.pathlen); dput(dentry); =20 /* For none EACCES cases will let the MDS do the mds auth check */ diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 9eed6d73a508..952eda931bc1 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -2729,7 +2729,7 @@ static u8 *get_fscrypt_altname(const struct ceph_mds_= request *req, u32 *plen) * foo/.snap/bar -> foo//bar */ char *ceph_mdsc_build_path(struct ceph_mds_client *mdsc, struct dentry *de= ntry, - int *plen, u64 *pbase, int for_wire) + struct ceph_path_info *path_info, int for_wire) { struct ceph_client *cl =3D mdsc->fsc->client; struct dentry *cur; @@ -2839,16 +2839,30 @@ char *ceph_mdsc_build_path(struct ceph_mds_client *= mdsc, struct dentry *dentry, return ERR_PTR(-ENAMETOOLONG); } =20 - *pbase =3D base; - *plen =3D PATH_MAX - 1 - pos; + /* Initialize the output structure */ + memset(path_info, 0, sizeof(*path_info)); + + path_info->vino.ino =3D base; + path_info->pathlen =3D PATH_MAX - 1 - pos; + path_info->path =3D path + pos; + path_info->freepath =3D true; + + /* Set snap from dentry if available */ + if (d_inode(dentry)) + path_info->vino.snap =3D ceph_snap(d_inode(dentry)); + else + path_info->vino.snap =3D CEPH_NOSNAP; + doutc(cl, "on %p %d built %llx '%.*s'\n", dentry, d_count(dentry), - base, *plen, path + pos); + base, PATH_MAX - 1 - pos, path + pos); return path + pos; } =20 + + static int build_dentry_path(struct ceph_mds_client *mdsc, struct dentry *= dentry, - struct inode *dir, const char **ppath, int *ppathlen, - u64 *pino, bool *pfreepath, bool parent_locked) + struct inode *dir, struct ceph_path_info *path_info, + bool parent_locked) { char *path; =20 @@ -2857,41 +2871,47 @@ static int build_dentry_path(struct ceph_mds_client= *mdsc, struct dentry *dentry dir =3D d_inode_rcu(dentry->d_parent); if (dir && parent_locked && ceph_snap(dir) =3D=3D CEPH_NOSNAP && !IS_ENCRYPTED(dir)) { - *pino =3D ceph_ino(dir); + path_info->vino.ino =3D ceph_ino(dir); + path_info->vino.snap =3D ceph_snap(dir); rcu_read_unlock(); - *ppath =3D dentry->d_name.name; - *ppathlen =3D dentry->d_name.len; + path_info->path =3D dentry->d_name.name; + path_info->pathlen =3D dentry->d_name.len; + path_info->freepath =3D false; return 0; } rcu_read_unlock(); - path =3D ceph_mdsc_build_path(mdsc, dentry, ppathlen, pino, 1); + path =3D ceph_mdsc_build_path(mdsc, dentry, path_info, 1); if (IS_ERR(path)) return PTR_ERR(path); - *ppath =3D path; - *pfreepath =3D true; + /* + * ceph_mdsc_build_path already fills path_info, including snap handling. + */ return 0; } =20 -static int build_inode_path(struct inode *inode, - const char **ppath, int *ppathlen, u64 *pino, - bool *pfreepath) +static int build_inode_path(struct inode *inode, struct ceph_path_info *pa= th_info) { struct ceph_mds_client *mdsc =3D ceph_sb_to_mdsc(inode->i_sb); struct dentry *dentry; char *path; =20 if (ceph_snap(inode) =3D=3D CEPH_NOSNAP) { - *pino =3D ceph_ino(inode); - *ppathlen =3D 0; + path_info->vino.ino =3D ceph_ino(inode); + path_info->vino.snap =3D ceph_snap(inode); + path_info->pathlen =3D 0; + path_info->freepath =3D false; return 0; } dentry =3D d_find_alias(inode); - path =3D ceph_mdsc_build_path(mdsc, dentry, ppathlen, pino, 1); + path =3D ceph_mdsc_build_path(mdsc, dentry, path_info, 1); dput(dentry); if (IS_ERR(path)) return PTR_ERR(path); - *ppath =3D path; - *pfreepath =3D true; + /* + * ceph_mdsc_build_path already fills path_info, including snap from dent= ry. + * Override with inode's snap since that's what this function is for. + */ + path_info->vino.snap =3D ceph_snap(inode); return 0; } =20 @@ -2901,26 +2921,31 @@ static int build_inode_path(struct inode *inode, */ static int set_request_path_attr(struct ceph_mds_client *mdsc, struct inod= e *rinode, struct dentry *rdentry, struct inode *rdiri, - const char *rpath, u64 rino, const char **ppath, - int *pathlen, u64 *ino, bool *freepath, + const char *rpath, u64 rino, struct ceph_path_info *path_info, bool parent_locked) { struct ceph_client *cl =3D mdsc->fsc->client; int r =3D 0; =20 + /* Initialize the output structure */ + memset(path_info, 0, sizeof(*path_info)); + if (rinode) { - r =3D build_inode_path(rinode, ppath, pathlen, ino, freepath); + r =3D build_inode_path(rinode, path_info); doutc(cl, " inode %p %llx.%llx\n", rinode, ceph_ino(rinode), ceph_snap(rinode)); } else if (rdentry) { - r =3D build_dentry_path(mdsc, rdentry, rdiri, ppath, pathlen, ino, - freepath, parent_locked); - doutc(cl, " dentry %p %llx/%.*s\n", rdentry, *ino, *pathlen, *ppath); + r =3D build_dentry_path(mdsc, rdentry, rdiri, path_info, parent_locked); + doutc(cl, " dentry %p %llx/%.*s\n", rdentry, path_info->vino.ino, + path_info->pathlen, path_info->path); } else if (rpath || rino) { - *ino =3D rino; - *ppath =3D rpath; - *pathlen =3D rpath ? strlen(rpath) : 0; - doutc(cl, " path %.*s\n", *pathlen, rpath); + path_info->vino.ino =3D rino; + path_info->vino.snap =3D CEPH_NOSNAP; + path_info->path =3D rpath; + path_info->pathlen =3D rpath ? strlen(rpath) : 0; + path_info->freepath =3D false; + + doutc(cl, " path %.*s\n", path_info->pathlen, rpath); } =20 return r; @@ -2997,11 +3022,8 @@ static struct ceph_msg *create_request_message(struc= t ceph_mds_session *session, struct ceph_client *cl =3D mdsc->fsc->client; struct ceph_msg *msg; struct ceph_mds_request_head_legacy *lhead; - const char *path1 =3D NULL; - const char *path2 =3D NULL; - u64 ino1 =3D 0, ino2 =3D 0; - int pathlen1 =3D 0, pathlen2 =3D 0; - bool freepath1 =3D false, freepath2 =3D false; + struct ceph_path_info path_info1 =3D {0}; + struct ceph_path_info path_info2 =3D {0}; struct dentry *old_dentry =3D NULL; int len; u16 releases; @@ -3011,25 +3033,42 @@ static struct ceph_msg *create_request_message(stru= ct ceph_mds_session *session, u16 request_head_version =3D mds_supported_head_version(session); kuid_t caller_fsuid =3D req->r_cred->fsuid; kgid_t caller_fsgid =3D req->r_cred->fsgid; + bool parent_locked =3D test_bit(CEPH_MDS_R_PARENT_LOCKED, &req->r_req_fla= gs); =20 - ret =3D set_request_path_attr(mdsc, req->r_inode, req->r_dentry, - req->r_parent, req->r_path1, req->r_ino1.ino, - &path1, &pathlen1, &ino1, &freepath1, - test_bit(CEPH_MDS_R_PARENT_LOCKED, - &req->r_req_flags)); + ret =3D set_request_path_attr(mdsc, req->r_inode, req->r_dentry, + req->r_parent, req->r_path1, req->r_ino1.ino, + &path_info1, parent_locked); if (ret < 0) { msg =3D ERR_PTR(ret); goto out; } =20 + /* + * When the parent directory's i_rwsem is *not* locked, req->r_parent may + * have become stale (e.g. after a concurrent rename) between the time the + * dentry was looked up and now. If we detect that the stored r_parent + * does not match the inode number we just encoded for the request, switch + * to the correct inode so that the MDS receives a valid parent reference. + */ + if (!parent_locked && + req->r_parent && path_info1.vino.ino && ceph_ino(req->r_parent) !=3D= path_info1.vino.ino) { + struct inode *correct_dir =3D ceph_get_inode(mdsc->fsc->sb, path_info1.v= ino, NULL); + if (!IS_ERR(correct_dir)) { + WARN(1, "ceph: r_parent mismatch (had %llx wanted %llx) - updating\n", + ceph_ino(req->r_parent), path_info1.vino.ino); + iput(req->r_parent); + req->r_parent =3D correct_dir; + } + } + /* If r_old_dentry is set, then assume that its parent is locked */ if (req->r_old_dentry && !(req->r_old_dentry->d_flags & DCACHE_DISCONNECTED)) old_dentry =3D req->r_old_dentry; ret =3D set_request_path_attr(mdsc, NULL, old_dentry, - req->r_old_dentry_dir, - req->r_path2, req->r_ino2.ino, - &path2, &pathlen2, &ino2, &freepath2, true); + req->r_old_dentry_dir, + req->r_path2, req->r_ino2.ino, + &path_info2, true); if (ret < 0) { msg =3D ERR_PTR(ret); goto out_free1; @@ -3060,7 +3099,7 @@ static struct ceph_msg *create_request_message(struct= ceph_mds_session *session, =20 /* filepaths */ len +=3D 2 * (1 + sizeof(u32) + sizeof(u64)); - len +=3D pathlen1 + pathlen2; + len +=3D path_info1.pathlen + path_info2.pathlen; =20 /* cap releases */ len +=3D sizeof(struct ceph_mds_request_release) * @@ -3068,9 +3107,9 @@ static struct ceph_msg *create_request_message(struct= ceph_mds_session *session, !!req->r_old_inode_drop + !!req->r_old_dentry_drop); =20 if (req->r_dentry_drop) - len +=3D pathlen1; + len +=3D path_info1.pathlen; if (req->r_old_dentry_drop) - len +=3D pathlen2; + len +=3D path_info2.pathlen; =20 /* MClientRequest tail */ =20 @@ -3183,8 +3222,8 @@ static struct ceph_msg *create_request_message(struct= ceph_mds_session *session, lhead->ino =3D cpu_to_le64(req->r_deleg_ino); lhead->args =3D req->r_args; =20 - ceph_encode_filepath(&p, end, ino1, path1); - ceph_encode_filepath(&p, end, ino2, path2); + ceph_encode_filepath(&p, end, path_info1.vino.ino, path_info1.path); + ceph_encode_filepath(&p, end, path_info2.vino.ino, path_info2.path); =20 /* make note of release offset, in case we need to replay */ req->r_request_release_offset =3D p - msg->front.iov_base; @@ -3247,11 +3286,11 @@ static struct ceph_msg *create_request_message(stru= ct ceph_mds_session *session, msg->hdr.data_off =3D cpu_to_le16(0); =20 out_free2: - if (freepath2) - ceph_mdsc_free_path((char *)path2, pathlen2); + if (path_info2.freepath) + ceph_mdsc_free_path((char *)path_info2.path, path_info2.pathlen); out_free1: - if (freepath1) - ceph_mdsc_free_path((char *)path1, pathlen1); + if (path_info1.freepath) + ceph_mdsc_free_path((char *)path_info1.path, path_info1.pathlen); out: return msg; out_err: @@ -4615,14 +4654,17 @@ static int reconnect_caps_cb(struct inode *inode, i= nt mds, void *arg) =20 dentry =3D d_find_primary(inode); if (dentry) { + struct ceph_path_info path_info; /* set pathbase to parent dir when msg_version >=3D 2 */ - path =3D ceph_mdsc_build_path(mdsc, dentry, &pathlen, &pathbase, + path =3D ceph_mdsc_build_path(mdsc, dentry, &path_info, recon_state->msg_version >=3D 2); dput(dentry); if (IS_ERR(path)) { err =3D PTR_ERR(path); goto out_err; } + pathlen =3D path_info.pathlen; + pathbase =3D path_info.vino.ino; } else { path =3D NULL; pathbase =3D 0; diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h index 3e2a6fa7c19a..c4c1ea8d5f5e 100644 --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -623,8 +623,18 @@ static inline void ceph_mdsc_free_path(char *path, int= len) __putname(path - (PATH_MAX - 1 - len)); } =20 +/* + * Structure to group path-related output parameters for build_*_path func= tions + */ +struct ceph_path_info { + const char *path; + int pathlen; + struct ceph_vino vino; + bool freepath; +}; + extern char *ceph_mdsc_build_path(struct ceph_mds_client *mdsc, - struct dentry *dentry, int *plen, u64 *base, + struct dentry *dentry, struct ceph_path_info *path_info, int for_wire); =20 extern void __ceph_mdsc_drop_dentry_lease(struct dentry *dentry); --=20 2.34.1