You can use regexp_replace:
from pyspark.sql import functions as F
df1 = df.withColumn(
    "number",
    # Split the digits into groups of 3, 7, and the remainder, joined by hyphens
    F.regexp_replace(F.col("number"), r"(\d{3})(\d{7})(\d+)", "$1-$2-$3")
)
df1.show()
#+--------------+
#|        number|
#+--------------+
#|123-4567890-12|
#+--------------+
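
If you want to sanity-check the pattern without a Spark session, here is a minimal sketch using Python's built-in re module. The helper name format_number and the sample input are assumptions for illustration, not part of the original question. Note that Python's re writes group references as \1, whereas Spark's regexp_replace (Java regex) uses $1.

```python
import re

# Same pattern as in the Spark call: 3 digits, 7 digits, then the rest
PATTERN = r"(\d{3})(\d{7})(\d+)"

def format_number(s: str) -> str:
    # Hypothetical helper: Python's re uses \1 where Spark (Java regex) uses $1
    return re.sub(PATTERN, r"\1-\2-\3", s)

print(format_number("123456789012"))  # 123-4567890-12
```

A string with fewer than 11 digits does not match the pattern and comes back unchanged, so you may want to handle short values separately.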