Separating suffixes with a dictionary

I need to separate all possible suffixes (about 1000) from a given word. I am thinking about using a dict.

In doing so I would have suffixes as keys (and some additional information about the suffixes as values needed in the further process). If the longest possible suffix is 4 letters long I would search the dict for all possible combinations. For example: Given a word: 'abcdefg' I would search the dict for 'g','fg','efg' and 'defg'.

I have done some research and haven't found many similar uses of a dict. Could this be a viable solution, or am I missing something here? Help much appreciated.

#0

If the suffixes aren't too long, your solution sounds fine -- it's only a few dictionary look-ups per word, and dictionary look-ups are fast. I don't think any more complex solution (like using a trie) would be worth it here. For only removing the suffix, you could also use a set instead of a dictionary, but since you need additional information for each suffix, a dictionary seems to be the natural choice.
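A minimal sketch of the dictionary approach from the question, assuming a small made-up suffix table (the `SUFFIXES` contents and the `find_suffixes` name are hypothetical, not from the original post). It tries every ending of the word up to the longest known suffix length and collects the ones that are keys in the dict:

```python
# Hypothetical suffix table: suffix -> extra information needed later.
SUFFIXES = {"g": "info-g", "efg": "info-efg", "ing": "info-ing"}

# Longest suffix length, computed once so each word costs only a few look-ups.
MAX_SUFFIX_LEN = max(len(s) for s in SUFFIXES)


def find_suffixes(word, suffixes=SUFFIXES, max_len=MAX_SUFFIX_LEN):
    """Return (suffix, info) pairs for every ending of `word` found in
    `suffixes`, shortest first."""
    found = []
    for n in range(1, min(max_len, len(word)) + 1):
        candidate = word[-n:]  # the last n characters of the word
        if candidate in suffixes:
            found.append((candidate, suffixes[candidate]))
    return found


print(find_suffixes("abcdefg"))
```

With this table, 'abcdefg' is checked against 'g', 'fg' and 'efg', and the last element of the result (if any) is the longest matching suffix.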

#1

The simplest (probably not fastest) way would be to find all matches in a list. With 1000 items, you shouldn't have much trouble with performance.

>>> sufx = ['foo', 'bar']
>>> [s for s in sufx if 'bazbar'.endswith(s)]
['bar']
>>> [s for s in sufx if 'bazbaz'.endswith(s)]
[]
>>> [s for s in sufx if 'bazfoo'.endswith(s)]
['foo']

#2

See Time Complexity of a dict. Lookup times for a dict are quite fast (O(1) on average!). For this implementation, the average time complexity for finding the longest suffix would be O(k^2), with k being the length of your word. It is k^2 because the ''.join operation runs once per character and is itself O(k). Some O(k) operation of this kind (such as reversed or string slicing) is unavoidable, as strings do not support an O(1) appendleft operation.

Simple way of doing it (tested for python 3):

>>> from collections import deque
>>> word = "antidisestablishmentarianism"
>>> suffixes = {'ism': 3, 'anism': 6, 'ment': 4, 'arianism': 12}
>>> suffix = deque()
>>> longest = None
>>> for char in reversed(word):
...     suffix.appendleft(char)
...     suf = ''.join(suffix)
...     if suf in suffixes:
...         longest = suf
...
>>> longest
'arianism'

#3

I'm not sure I understand your use case correctly. I guess it is about the fact that you are handling suffixes and they are hard to detect.

A typical approach (typically in indexing situations) would be to turn your string around and handle the suffix as a prefix. Then you can do a simple binary search in a sorted list of your reversed suffixes (thus prefixes).
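A rough sketch of that idea, assuming the suffix set from answer #2 and the standard `bisect` module (the helper names `build_index` and `longest_suffix` are made up for illustration). It reverses the suffixes once, sorts them, and then binary-searches each prefix of the reversed word, longest first:

```python
from bisect import bisect_left


def build_index(suffixes):
    """Reverse every suffix and sort, so suffixes become prefixes."""
    return sorted(s[::-1] for s in suffixes)


def longest_suffix(word, index, max_len):
    """Return the longest suffix of `word` present in `index`, or None."""
    rev = word[::-1]
    for n in range(min(max_len, len(rev)), 0, -1):
        prefix = rev[:n]
        i = bisect_left(index, prefix)  # binary search for an exact match
        if i < len(index) and index[i] == prefix:
            return prefix[::-1]  # turn it back into a suffix
    return None


suffixes = ["ism", "anism", "ment", "arianism"]
index = build_index(suffixes)
max_len = max(len(s) for s in suffixes)
print(longest_suffix("antidisestablishmentarianism", index, max_len))
```

Each lookup here is O(log n) in the number of suffixes, so for around 1000 suffixes a plain dict is likely at least as fast. The sorted list mainly pays off when the data already lives in a sorted index.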

#4

If I understand what you want to do, you should be using the re module in the standard lib.

Docs are here:

http://docs.python.org/library/re.html#module-re

There's an example regarding adverbs here:

http://docs.python.org/library/re.html#finding-all-adverbs

As for storing them as keys in a dict, that seems fine to me, especially if you want to do some other processing for words that have the suffixes you care about.
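One way this could look, as a sketch only: build a single pattern from the dict keys and anchor it at the end of the word. The `SUFFIX_INFO` values are borrowed from answer #2 purely as placeholder data, and `split_suffix` is a hypothetical helper name.

```python
import re

# Placeholder data: suffix -> additional information about the suffix.
SUFFIX_INFO = {"ism": 3, "anism": 6, "ment": 4, "arianism": 12}

# One alternation of all suffixes, anchored at the end of the string.
# Every alternative must end at \Z, so the leftmost match that
# re.search finds is also the longest matching suffix.
SUFFIX_RE = re.compile(
    r"(?:%s)\Z" % "|".join(re.escape(s) for s in SUFFIX_INFO)
)


def split_suffix(word):
    """Return (stem, suffix, info), or (word, None, None) if nothing matches."""
    match = SUFFIX_RE.search(word)
    if match is None:
        return word, None, None
    suffix = match.group(0)
    return word[:match.start()], suffix, SUFFIX_INFO[suffix]


print(split_suffix("antidisestablishmentarianism"))
```

The dict is still what carries the per-suffix information. The regular expression only replaces the loop over candidate endings.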
