One possible way to do this:
import pandas as pd

# The frame to search in; "idx" becomes the index.
df = pd.DataFrame(
    [
        [0, "008457", "02", "hello"],
        [1, "990037", "05", "I"],
        [2, "774426", "10", "am"],
        [3, "564389", "08", "sleeping"],
        [4, "009124", "17", "today"],
        [5, "000029", "13", "is"],
        [6, "548751", "21", "a"],
        [7, "479903", "19", "bright"],
        [8, "897054", "08", "sunny"],
        [9, "336588", "7", "day"],
        [10, "294260", "16", "today"],
        [11, "908751", "29", "is"],
        [12, "558902", "81", "rainy"],
        [13, "097856", "19", "with"],
        [14, "110044", "24", "cold"],
        [15, "775098", "16", "today"],
        [16, "665490", "02", "is"],
        [17, "887099", "07", "sunday"],
        [18, "389011", "18", "ahhh"],
        [19, "675510", "11", "weekend"],
    ],
    columns=["idx", "id_0", "user", "string"],
)
df = df.set_index("idx")
df1 = pd.DataFrame(
    [
        [0, "today"],
        [1, "is"],
        [2, "a"],
        [3, "bright"],
        [4, "sunny"],
        [5, "day"],
    ],
    columns=["idx", "string"],
)
# Slide a window of len(df1) rows over df and compare the "string" values.
matching_indices = []
for i in range(len(df) - len(df1) + 1):
    if (df.string.iloc[i:i + len(df1)].values == df1.string.values).all():
        matching_indices += list(range(i, i + len(df1)))

df.iloc[matching_indices]
With the output:
       id_0  user  string
idx
4    009124    17   today
5    000029    13      is
6    548751    21       a
7    479903    19  bright
8    897054    08   sunny
9    336588     7     day
The code above returns every matching subsequence with its correct indices, not just the first occurrence.
If you want only the first occurrence, you can break out of the loop as soon as a match is found, as shown below:
matching_indices = []
for i in range(len(df) - len(df1) + 1):
    if (df.string.iloc[i:i + len(df1)].values == df1.string.values).all():
        matching_indices += list(range(i, i + len(df1)))
        break  # stop after the first match

df.iloc[matching_indices]
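To see the difference between the two variants, here is a minimal sketch that wraps the same sliding-window loop in one function. The name find_subsequence and its first_only flag are my own, not part of pandas. The demo reuses the "string" column from above as a plain list and searches for the shorter pattern "today", "is", which occurs three times.

```python
def find_subsequence(haystack, needle, first_only=False):
    """Return the positions of all rows that belong to a contiguous run
    equal to `needle`. Hypothetical helper, not a pandas function."""
    hay = list(haystack)
    pat = list(needle)
    positions = []
    for i in range(len(hay) - len(pat) + 1):
        # Compare the window starting at i with the whole pattern.
        if hay[i:i + len(pat)] == pat:
            positions.extend(range(i, i + len(pat)))
            if first_only:
                break  # stop after the first match
    return positions


# Same values as the "string" column of df above.
words = [
    "hello", "I", "am", "sleeping", "today", "is", "a", "bright", "sunny", "day",
    "today", "is", "rainy", "with", "cold", "today", "is", "sunday", "ahhh", "weekend",
]
pattern = ["today", "is"]

all_matches = find_subsequence(words, pattern)
first_match = find_subsequence(words, pattern, first_only=True)
print(all_matches)  # [4, 5, 10, 11, 15, 16]
print(first_match)  # [4, 5]
```

Since the function only calls list() on its arguments, it should also accept Series, so with the frames above you could write df.iloc[find_subsequence(df["string"], df1["string"])]. Note that if matches overlap, the shared positions appear more than once in the result, just as in the original loop.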