You can use a regular expression with regexp_replace, e.g.:
from pyspark.sql import functions as F
# Raw strings keep the backslashes intact: in the regex, \\ matches one literal backslash.
df = df.withColumn('sub_path', F.regexp_replace("path", r"^\\\\[a-zA-Z0-9]+\\[a-zA-Z0-9]+\\", ""))
You can also make this more flexible by building the pattern from the number of slashes to strip, e.g.:
from pyspark.sql import functions as F
no_of_slashes=4 # number of slashes to consider here
# Build the regular expression by repeating r"[a-zA-Z0-9]+\\" (one path segment plus its trailing backslash).
# NB: we subtract 2 because the pattern already starts with the first 2 slashes.
df = df.withColumn('sub_path', F.regexp_replace("path", r"^\\\\" + r"[a-zA-Z0-9]+\\" * (no_of_slashes - 2), ""))
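If you want to sanity-check the pattern before running it on the cluster, here is a minimal sketch using Python's built-in re module instead of Spark. The helper name build_prefix_pattern and the sample path are hypothetical, made up for illustration, and it assumes your paths look like \\server\share\... with purely alphanumeric segments. For a pattern this simple, Python's re and the Java regex engine behind regexp_replace should behave the same.

```python
import re

def build_prefix_pattern(no_of_slashes):
    # Hypothetical helper: two leading backslashes, then
    # (no_of_slashes - 2) repetitions of "alphanumeric segment + backslash".
    return r"^\\\\" + r"[a-zA-Z0-9]+\\" * (no_of_slashes - 2)

# Hypothetical sample value for the "path" column.
sample_path = r"\\server01\share\reports\2023\data.csv"

pattern = build_prefix_pattern(4)
sub_path = re.sub(pattern, "", sample_path)
print(sub_path)  # reports\2023\data.csv
```

If a segment can contain other characters (dots, dashes, spaces), the character class would need widening, for example to [^\\]+.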
Let me know if this works for you.