29 Commits

Author SHA1 Message Date
Xiao Pan
beebb0faa7 fix: calling read_history() multiple times adds duplicate entries to the history list
Issue Description:

When sdcv finds multiple matches, then whatever you choose, sdcv writes
the existing history entries to the history file twice. For example, if
the current history is "a\nb" and you search for akjk, which has multiple
results, then whatever you choose, even -1, the history file is
rewritten to "a\nb\nakjk\na\nb" (here \n is the newline character).
So if you have 500 lines of history and search for akjk, choosing -1
leaves you with 1001 lines of history.

How to reproduce:

You can download this dictionary file
https://github.com/skywind3000/ECDICT/releases/download/1.0.28/ecdict-stardict-28.zip
and put it into your dictionary directory; on Arch you can install it
from the AUR: https://aur.archlinux.org/packages/stardict-ecdict. Then
make sure you also add the dictionary name to ~/.config/sdcv_ordering if
you have one. Add some lines to your history file if it has none.
Then search for "akjk" with sdcv, e.g. `sdcv akjk`. It will prompt
you to choose; choose -1, then press ctrl-d to exit. The expected result
is that akjk is appended to the history file. The actual result is that
the history file contains the original content + akjk + a duplicate of
the original content, as described in the issue description.

Fix and reasons:

I'm a hobbyist, not a professional, and I haven't used C++ for years,
so much of what I write here is likely wrong. After some trial and error,
I found that calling read_history() multiple times adds duplicate
entries to the history list. Commit d2327e2 created a new IReadLine
object (note that names have changed since, and I know that calling it
an IReadLine object is not accurate). Here's a permalink:
49c8094b53/src/libwrapper.cpp (L418).
The problem with this new IReadLine object `choice_readline` is that its
constructor in ./src/readline.cpp calls read_history() again, even though
an IReadLine object `io` has already been constructed in ./src/sdcv.cpp.
When read_history() is called twice, the history list is filled from the
history file and then the same file is appended to it again, so when the
IReadLine destructor in ./src/readline.cpp calls write_history(), the
history file ends up with duplicate content. Here are
permalinks:
49c8094b53/src/readline.cpp (L88),
and
49c8094b53/src/readline.cpp (L94).

So my fix is to just use the existing `io` IReadLine object instead of
creating a new `choice_readline` object.
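
The effect described above can be sketched without readline at all. This
is only an illustration under assumptions: `g_history`,
`fake_read_history()` and `FakeReadLine` are hypothetical stand-ins for
the shared history list, read_history() and the IReadLine wrapper; they
are not sdcv or readline code.
```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a history list shared by the whole process.
static std::vector<std::string> g_history;

// Hypothetical stand-in for read_history(): appends the file's lines to
// the shared list without clearing it first.
void fake_read_history(const std::vector<std::string> &file)
{
    g_history.insert(g_history.end(), file.begin(), file.end());
}

// Hypothetical wrapper that loads the history file when it is constructed.
class FakeReadLine
{
public:
    explicit FakeReadLine(const std::vector<std::string> &file)
    {
        fake_read_history(file);
    }
    void add(const std::string &line) { g_history.push_back(line); }
};

// Buggy shape: a second wrapper is constructed for the choice prompt.
std::vector<std::string> with_second_object(const std::vector<std::string> &file)
{
    g_history.clear();
    FakeReadLine io(file);
    io.add("akjk");
    FakeReadLine choice_readline(file); // loads the same file a second time
    return g_history;                   // what would be written back
}

// Fixed shape: the existing wrapper is reused for the choice prompt.
std::vector<std::string> with_single_object(const std::vector<std::string> &file)
{
    g_history.clear();
    FakeReadLine io(file);
    io.add("akjk");
    return g_history;
}
```
With a file containing a and b, the first function yields a, b, akjk, a,
b and the second yields a, b, akjk.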

Misc:

During my trial and error, I wrote some example code to show
read_history()'s odd behavior. I did not dig deeper; my guess is that
it is related to the history list described in
https://tiswww.cwru.edu/php/chet/readline/history.html#History-List-Management.

Here's the example code a.c. Note that you need to include stdio.h,
readline/history.h, and readline/readline.h. I did not include them here
because git seems to treat those lines as comments in a commit message.
```c
...
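int main (void)
{
	/* First read: the history list now holds the contents of history.txt. */
	rl_readline_name="learn_readline";
	using_history();
	read_history("/home/xyz/test/learn_readline/history.txt");
	write_history("/home/xyz/test/learn_readline/history2.txt");
	{
		/* Second read of the same file: its lines are appended to the
		   existing list, so the list now holds them twice. */
		rl_readline_name="learn_readline";
		using_history();
		read_history("/home/xyz/test/learn_readline/history.txt");
		write_history("/home/xyz/test/learn_readline/history3.txt");
	}
	/* The list is still doubled after the inner block ends. */
	write_history("/home/xyz/test/learn_readline/history4.txt");
	return 0;
}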
```
Here's the content of history.txt:
```
a
b
```
After building and running it, history2.txt is, as you would expect, the
same as history.txt. But history3.txt and history4.txt both contain:
```
a
b
a
b
```

Signed-off-by: Xiao Pan <xyz@flylightning.xyz>
2024-08-15 13:08:33 +03:00
Aleksa Sarai
6d385221d0 lookup: return all matching entries found during lookup
Previously, we would just return the first entry we found that matched
the requested word. This causes issues with dictionaries that have lots
of entries which can be found using the same search string. In these
cases, the user got a completely arbitrary word returned to them rather
than the full set.

While this may seem strange, this is incredibly commonplace in Japanese
and likely several other languages. In Japanese:

 * When written using kanji, the same string of characters could refer
   to more than one word which may have a completely different meaning.
   Examples include 潜る (くぐる、もぐる) and 辛い (からい、つらい).

 * When written in kana, the same string of characters can also refer to
   more than one word which is written using completely different kanji,
   and has a completely different meaning. Examples include きく
   (聞く、効く、菊) and たつ (立つ、建つ、絶つ).

In both cases, these are different words in every sense of the word, and
have separate headwords for each in the dictionary. Thus in order to be
completely useful for such dictionaries, sdcv needs to be able to return
every matching word in the dictionary.

The solution is conceptually simple -- return a set containing the
indices rather than just a single index. Since every list we search is
sorted (to allow binary searching), once we find one match we can just
walk backwards and forwards from the match point to find the entire
block of matching terms and add them to the set in linear time. A
std::set is used so that we don't return duplicate results needlessly.

This solution was in practice a bit more complicated because .otf cache
files require a bit more fiddling, and also the ->lookup methods are
used by some callers to find the next entry if no entry was found. But
on the whole it's not too drastic of a change from the previous setup.
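
The core idea can be sketched as follows. This is a minimal
illustration, not the sdcv implementation: `lookup_all` is a
hypothetical name, the list is a plain sorted std::vector, and entries
are compared with ordinary std::string comparison rather than sdcv's own
ordering; the .otf cache handling is left out entirely.
```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Returns the indices of every entry equal to `word` in a sorted list.
// Assumes `sorted` is ordered by plain std::string comparison.
std::set<std::size_t> lookup_all(const std::vector<std::string> &sorted,
                                 const std::string &word)
{
    std::set<std::size_t> found;
    std::size_t lo = 0, hi = sorted.size();
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo) / 2;
        const int cmp = sorted[mid].compare(word);
        if (cmp == 0) {
            // One match found: walk backwards and forwards to collect the
            // whole block of equal entries.
            std::size_t first = mid;
            while (first > 0 && sorted[first - 1] == word)
                --first;
            std::size_t last = mid;
            while (last + 1 < sorted.size() && sorted[last + 1] == word)
                ++last;
            for (std::size_t i = first; i <= last; ++i)
                found.insert(i);
            return found;
        }
        if (cmp < 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    return found; // empty: no entry matched
}
```
For a list such as apple, kiku, kiku, kiku, tatsu, looking up kiku
returns the indices 1, 2 and 3 instead of a single arbitrary one.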

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2021-11-14 22:38:26 +03:00
Evgeniy A. Dushistov
7facbe215e refactoring: run clang-format against code 2020-08-14 12:36:02 +03:00
alcah
021e467b37 return exit code 2 if search term not found 2020-03-17 22:15:16 +10:30
Evgeniy A. Dushistov
8f16ceae59 refactoring: apply clang-format rules 2017-08-09 07:46:27 +03:00
Evgeniy A. Dushistov
d0c0a0837f fix: do not give interactive menu via pager
fixes #28
2017-08-09 07:41:33 +03:00
Peter
e85927e562 Add -e for exact searches (no fuzzy matches).
Only exact matches (or synonyms) are returned for simple searches.
2017-07-28 11:39:34 +02:00
Peter
835dffcaf8 Add additional type identifiers h,w,k
Like for xdxf, no processing is done, the raw content is shown.
2017-07-27 08:15:45 +02:00
Evgeniy Dushistov
af6362f5df Merge pull request #23 from sleep-walker/master
fix FSF address in LICENSE
2017-07-27 00:30:30 +03:00
Tomáš Čech
98e98d0746 fix FSF address 2017-07-26 22:39:28 +02:00
Peter
3105823e8b Add option --json-output (-j)
If given -j, format the output of -l and of searches as JSON.
2017-07-26 22:07:23 +02:00
Anton Yuzhaninov
84367a5744 Fix using SDCV_PAGER
A stream opened with popen() should be closed with pclose(), as
documented in the popen(3) man page.
2017-03-07 18:37:52 -05:00
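The popen()/pclose() pairing from the SDCV_PAGER fix above (84367a5744) might look like the following sketch. It assumes a POSIX system; `write_through_pipe` is a hypothetical helper for illustration and is not taken from sdcv.
```cpp
#include <cstdio>
#include <stdio.h> // popen() and pclose() are POSIX, declared in stdio.h

// Sends `text` to the standard input of `command` and returns the
// command's wait status, or -1 if the pipe could not be opened.
int write_through_pipe(const char *command, const char *text)
{
    FILE *out = popen(command, "w");
    if (out == nullptr)
        return -1;
    std::fputs(text, out);
    // pclose(), not fclose(): it waits for the child process to finish
    // and reports its status.
    return pclose(out);
}
```
A call such as `write_through_pipe("cat > /dev/null", "hello\n")` is expected to return 0 when the command exits successfully.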
Evgeniy Dushistov
c78d59de5f fixes for last commit 2014-10-24 18:03:30 +00:00
Evgeniy Dushistov
73664c078a add minimal support for KingSoft PowerWord's data 2014-10-23 23:23:48 +00:00
Evgeniy Dushistov
624208793e fix missing unescape for &apos;, thanks to Svyatoslav Mishyn 2014-10-23 22:43:04 +00:00
Evgeniy Dushistov
ab8d6ec74e if search returns more than one result,
save choice to readline history
2014-05-11 02:20:03 +00:00
Evgeniy Dushistov
8298a578b0 check fread calls 2013-07-07 23:29:09 +00:00
Evgeniy Dushistov
5f8d2cb174 remove not used code, use glib wrappers where possible 2013-07-07 20:12:03 +00:00
Evgeniy Dushistov
9034e792b6 code cleanups + use get_uint32 where possible instead of an unsafe cast 2013-07-07 17:05:55 +00:00
Evgeniy Dushistov
f5c62baeb9 support for using colors in sdcv output 2013-07-07 14:43:53 +00:00
Evgeniy Dushistov
3812fad586 c++11 for readline + libwrapper 2013-07-06 22:44:11 +00:00
Evgeniy Dushistov
d2327e2a0f Import patch from Roman Imankulov:
"sdcv" does not set up the `rl_readline_name' variable, which can be extremely useful when writing the .inputrc file. Additionally, sdcv does not use readline in the "Your choice [-1 to abort]" dialog. This patch fixes both of these issues; readline_name is set to "sdcv".
This trick allows me to add into the .inputrc
$if sdcv
"\e\e": "-1\n"
$endif
and type double-escape in the "Your choice" dialog instead of the pretty annoying "-1".
2013-07-06 13:40:11 +00:00
Evgeniy Dushistov
ab22f8eb41 replace variable-size array with vector,
this should help the clang compiler compile our source code
2013-07-06 12:52:48 +00:00
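The kind of change described in ab22f8eb41 above can be sketched like this. Variable-length arrays are a compiler extension in C++, not part of the standard, so a std::vector sized at run time is the portable replacement. `repeat_char` is a hypothetical function for illustration and is not taken from sdcv.
```cpp
#include <cstddef>
#include <string>
#include <vector>

// Builds a string of `n` copies of `c` using a run-time sized buffer.
std::string repeat_char(char c, std::size_t n)
{
    // Instead of `char buf[n];` (a variable-length array), allocate the
    // buffer with std::vector, which is standard C++.
    std::vector<char> buf(n, c);
    return std::string(buf.begin(), buf.end());
}
```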
Evgeniy Dushistov
2a5da7969f fix warnings about size_t %d -> %zu (C99) 2013-07-06 10:18:21 +00:00
Evgeniy Dushistov
684a8cef34 fix typo, thanks to
Michal Čihař ( nijel )
2010-08-01 20:21:27 +00:00
Evgeniy Dushistov
3da4808990 fix build with gcc 4.3 2008-10-11 16:22:08 +00:00
Evgeniy Dushistov
c7c8dab8db get rid of getopt because it causes problems on Mac OS X. 2007-09-30 18:10:19 +00:00
Evgeniy Dushistov
e34edfcf24 sdcv should be able to handle pango markup type records
correctly
2007-08-14 18:35:42 +00:00
Evgeniy Dushistov
3f241bb6bb 0.4.2 release 2007-08-14 18:18:20 +00:00