Commit Graph

199 Commits

Author SHA1 Message Date
Evgeniy A. Dushistov
b74bc2478a chore: document integration with readline
fixes #27
2025-08-17 14:22:40 +03:00
Vitaly Zdanevich
58c48988f6 README.org: add fzf 2025-08-17 13:58:02 +03:00
NorwayFun
07cd873e9d po: Adding Georgian translation 2025-08-17 13:52:26 +03:00
Evgeniy A. Dushistov
4545473da9 refactor: use more clear way to concat strings 2025-08-17 13:51:53 +03:00
Evgeniy A. Dushistov
849f0ed1ac fix: memory leak introduced in #110 2025-08-17 13:51:53 +03:00
Evgeniy A. Dushistov
d5e1eb4d93 test: use actions/checkout@v4 2025-08-17 13:32:56 +03:00
Evgeniy A. Dushistov
3a4b76124c test: make sure cmake 3.10 works 2025-08-17 13:32:56 +03:00
Evgeniy A. Dushistov
6eaebaaa2f test: use cmake from distribution 2025-08-17 13:32:56 +03:00
Evgeniy A. Dushistov
e24722b8fc test: ubuntu 20.04 is missing, use 22.04 instead 2025-08-17 13:32:56 +03:00
Evgeniy A. Dushistov
c57ef6e916 fix: use READLINE_(INCLUDE_DIR|LIBRARY) if WITH_READLINE==True 2025-08-17 13:32:56 +03:00
Evgeniy A. Dushistov
24c08365c4 fix: set given invalid arguments for CACHE mode: missing type or docstring 2025-08-17 13:32:56 +03:00
Evgeniy A. Dushistov
8f77ede167 chore: update requirement for cmake to 3.10, to make modern cmake happy
and fix build on latest Ubuntu (24.04)
2025-08-17 13:32:56 +03:00
Evgeniy A. Dushistov
3a8ab1d5c3 test: install missing libglib2.0-dev on CI machine 2025-08-17 13:32:56 +03:00
Norayr Chilingarian
5887505185 Fix build with GCC 14 and modern glib: const correctness and deprecated API
- Use 'const gchar*' for result of g_utf8_next_char() to satisfy GCC 14's stricter const rules
- Remove incorrect g_free() on non-allocated pointer from g_utf8_next_char()
- Replace deprecated g_pattern_match_string() with g_pattern_spec_match_string()
2025-08-17 12:48:04 +03:00
Xiao Pan
beebb0faa7 fix: read_history() multiple times will add repeat histories to history lists
Issue Description:

When sdcv finds multiple items, then whatever you choose, sdcv appends a
second copy of the existing history entries to the history file. For
example, if the current history is "a\nb" and you search for akjk, which
has multiple results, then whatever you choose, even -1, the history
file is rewritten to "a\nb\nakjk\na\nb" (here \n is the newline
character). So if you have 500 lines of history and search for akjk,
choosing -1 leaves you with 1001 lines of history.

How to reproduce:

You can download this dictionary file
https://github.com/skywind3000/ECDICT/releases/download/1.0.28/ecdict-stardict-28.zip
and put it into your dictionary directory; on Arch, you can install it
from the AUR at https://aur.archlinux.org/packages/stardict-ecdict. Then
make sure you also add the dictionary name to ~/.config/sdcv_ordering if
you have one. Add some lines to your history file if it is empty. Then
search for "akjk" with sdcv, e.g. `sdcv akjk`; when it prompts you to
choose, pick -1, then press ctrl-d to exit. The expected result is that
akjk is appended to the history file. The actual result is that the
history file now contains the original content, then akjk, then a
duplicate of the original content, as described in the issue
description.

Fix and reasons:

I'm a hobbyist, not a professional, and I haven't used C++ for years,
so much of what I write here is very likely wrong. After some trial and
error, I found that calling read_history() multiple times adds repeated
entries to the history list. In commit d2327e2, a new IReadLine
object is created (note that the name has changed; also note that I know
calling it an IReadLine object is not accurate). Here's a permalink:
49c8094b53/src/libwrapper.cpp (L418).
The problem with this new IReadLine object `choice_readline` is that its
constructor in ./src/readline.cpp calls read_history() again, although
an IReadLine object `io` has already been constructed in ./src/sdcv.cpp.
When read_history() is called twice, the history list is read from the
history file and then the file is appended to it again, so when the
IReadLine object is destructed and calls write_history() in
./src/readline.cpp, the history file contains duplicate content. Here
are permalinks:
49c8094b53/src/readline.cpp (L88),
and
49c8094b53/src/readline.cpp (L94).

So my fix is simply to use the `io` IReadLine object and not create a
new `choice_readline` object.

Misc:

During my trial and error, I wrote some example code to show
read_history()'s weird behavior. I did not dig deeper; I just guess
there is some kind of weird history list, as mentioned in
https://tiswww.cwru.edu/php/chet/readline/history.html#History-List-Management.

Here's the example code a.c. It needs stdio.h, readline/readline.h, and
readline/history.h, and has to be linked against readline (e.g. with
-lreadline).
```c
#include <stdio.h>
#include <readline/readline.h>
#include <readline/history.h>

int main (void)
{
	rl_readline_name="learn_readline";
	using_history();
	/* First read: the history list now holds the file's lines once. */
	read_history("/home/xyz/test/learn_readline/history.txt");
	write_history("/home/xyz/test/learn_readline/history2.txt");
	{
		rl_readline_name="learn_readline";
		using_history();
		/* Second read: the same lines are added to the list again. */
		read_history("/home/xyz/test/learn_readline/history.txt");
		write_history("/home/xyz/test/learn_readline/history3.txt");
	}
	/* The list is process-wide, so it still holds both copies here. */
	write_history("/home/xyz/test/learn_readline/history4.txt");
	return 0;
}
```
Here's the content of history.txt:
```
a
b
```
After building and running it, as you can guess, history2.txt is the
same as history.txt. But history3.txt and history4.txt both contain:
```
a
b
a
b
```
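The behavior shown above is consistent with the GNU History documentation, which describes read_history() as adding the contents of the file to the history list. The following is a toy model of that interaction in C++, using only the standard library. `HistoryModel`, `Reader`, and `history_after_search` are hypothetical names made up for this sketch; it does not call readline, and it only assumes that reading appends to one shared in-memory list and that writing overwrites the file with that list.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for the history file plus the process-wide in-memory
// history list. This is NOT the readline API, only a model of it.
struct HistoryModel {
    std::vector<std::string> file; // lines stored in the history file
    std::vector<std::string> list; // in-memory history list

    // Models read_history(): appends the file's lines to the list.
    void read() { list.insert(list.end(), file.begin(), file.end()); }
    // Models write_history(): overwrites the file with the whole list.
    void write() { file = list; }
};

// Hypothetical wrapper that reads on construction and writes on
// destruction, as the commit message describes for IReadLine.
class Reader {
public:
    explicit Reader(HistoryModel &h) : h_(h) { h_.read(); }
    ~Reader() { h_.write(); }
    void add(const std::string &line) { h_.list.push_back(line); }

private:
    HistoryModel &h_;
};

// Runs one search for "akjk". Optionally constructs a second Reader for
// the choice prompt, which is the situation the fix removes.
std::vector<std::string> history_after_search(bool second_reader)
{
    HistoryModel h;
    h.file = {"a", "b"};
    {
        Reader io(h);
        io.add("akjk");
        if (second_reader) {
            Reader choice_readline(h); // reads the file a second time
        }
    }
    assert(h.file.size() >= 3); // the new entry is always written back
    return h.file;
}
```

With `second_reader` set to true the modeled file ends up as a, b, akjk, a, b, which matches the contents reported in the issue description; with false it ends up as a, b, akjk.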

Signed-off-by: Xiao Pan <xyz@flylightning.xyz>
2024-08-15 13:08:33 +03:00
Evgeniy A. Dushistov
49c8094b53 version 0.5.5 v0.5.5 2023-04-18 21:47:55 +03:00
Evgeniy A. Dushistov
4346e65bd3 fix CI build: ubuntu-18.04 not supported by github actions anymore 2023-04-18 21:44:18 +03:00
Evgeniy A. Dushistov
d144e0310c fix CI build 2023-01-16 16:44:09 +03:00
NiLuJe
6e36e7730c Warn on unknown dicts 2022-09-16 18:48:08 +03:00
NiLuJe
abe5e9e72f Check accesses to the bookname_to_ifo std::map
Avoid crashes when passing unknown dicts to the -u flag

Fix #87
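
A minimal sketch of what a checked access to such a map can look like, assuming the map goes from book names to .ifo paths. The function name `resolve_dicts`, its signature, and the value type are hypothetical and not taken from sdcv's sources; the point is only that find() is compared against end() before the entry is used.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: translate requested book names into .ifo paths.
// Names missing from the map are collected in `unknown` instead of
// being dereferenced.
std::vector<std::string> resolve_dicts(
    const std::map<std::string, std::string> &bookname_to_ifo,
    const std::vector<std::string> &requested,
    std::vector<std::string> &unknown)
{
    unknown.clear();
    std::vector<std::string> ifo_paths;
    for (const auto &name : requested) {
        const auto it = bookname_to_ifo.find(name);
        if (it == bookname_to_ifo.end()) {
            unknown.push_back(name); // caller can warn about these
            continue;
        }
        ifo_paths.push_back(it->second);
    }
    // Every requested name is either resolved or reported as unknown.
    assert(ifo_paths.size() + unknown.size() == requested.size());
    return ifo_paths;
}
```

A caller could print a warning for each entry left in `unknown` and carry on with the dictionaries that were found.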
2022-09-16 18:48:08 +03:00
NiLuJe
488ec68854 Use off_t for stuff mainly assigned to a stat.st_size value
Allows simplifying the mmap sanity checks in mapfile, and actually
ensuring they won't break when -D_FILE_OFFSET_BITS=64
2022-09-14 22:12:29 +03:00
Marcelino Alberdi Pereira
b698445ead Add a small summary of the project to the README 2022-09-07 17:51:13 +03:00
Evgeniy A. Dushistov
504e7807e6 add information about 0.5.4 into NEWS 2022-06-24 21:49:00 +03:00
Evgeniy A. Dushistov
6c80bf2d99 t_json: add data about new dictionary 2022-06-24 21:34:47 +03:00
Evgeniy A. Dushistov
8742575c33 fix bash syntax error 2022-06-24 21:34:47 +03:00
Evgeniy A. Dushistov
b294b76fb5 check file size before mapping on linux 2022-06-24 21:34:47 +03:00
Evgeniy A. Dushistov
823ec3d840 clang-format for mapfile 2022-06-24 21:34:47 +03:00
Evgeniy A. Dushistov
6ab8b51e6c version 0.5.4 2022-06-24 21:34:47 +03:00
Evgeniy A. Dushistov
881657b336 Revert "replace deprecated g_pattern_match_string function"
This reverts commit 452a4e07fb.
2022-06-24 21:34:47 +03:00
Evgeniy A. Dushistov
911fc2f561 more robust parsing of ifo file
fixes #79 fixes #81
2022-06-24 21:34:47 +03:00
Evgeniy A. Dushistov
f488f5350b stardict_lib.hpp: remove unused headers plus clang-format 2022-06-24 21:34:47 +03:00
Evgeniy A. Dushistov
e72220e748 use cmake to check if compiler supports c++11 2022-06-24 21:34:47 +03:00
Evgeniy A. Dushistov
b77c0e793a replace deprecated g_pattern_match_string function 2022-06-24 21:34:47 +03:00
Evgeniy A. Dushistov
ebaa6f2136 clang-format for stardict_lib.cpp 2022-06-24 21:34:47 +03:00
Aleksa Sarai
d054adb37c tests: add multiple results integration test
Make sure we return all of the relevant results, even in cases with
lots of results (more than ENTR_PER_PAGE in the offset index) and
where a synonym and a headword are both present for the same word.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2021-11-14 22:38:26 +03:00
Aleksa Sarai
4a9b1dae3d stardict_lib: remove dead poGet{Current,Next,Pre}Word iterators
They aren't used at all by sdcv, and thus aren't tested. This makes
changes to the core lookup algorithms harder than necessary: these
methods depend on those algorithms, but without tests there's no real
way of knowing whether a change has broken them.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2021-11-14 22:38:26 +03:00
Aleksa Sarai
6d385221d0 lookup: return all matching entries found during lookup
Previously, we would just return the first entry we found that matched
the requested word. This causes issues with dictionaries that have lots
of entries which can be found using the same search string. In these
cases, the user got a completely arbitrary word returned to them rather
than the full set.

While this may seem strange, this is incredibly commonplace in Japanese
and likely several other languages. In Japanese:

 * When written using kanji, the same string of characters could refer
   to more than one word which may have a completely different meaning.
   Examples include 潜る (くぐる、もぐる) and 辛い (からい、つらい).

 * When written in kana, the same string of characters can also refer to
   more than one word which is written using completely different kanji,
   and has a completely different meaning. Examples include きく
   (聞く、効く、菊) and たつ (立つ、建つ、絶つ).

In both cases, these are different words in every sense of the word, and
have separate headwords for each in the dictionary. Thus in order to be
completely useful for such dictionaries, sdcv needs to be able to return
every matching word in the dictionary.

The solution is conceptually simple -- return a set containing the
indices rather than just a single index. Since every list we search is
sorted (to allow binary searching), once we find one match we can just
walk backwards and forwards from the match point to find the entire
block of matching terms and add them to the set in linear time. A
std::set is used so that we don't return duplicate results needlessly.
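
The idea in the paragraph above can be sketched as follows. This is an illustration only, not the code from this commit: `lookup_all` is a hypothetical helper, the word list is a plain sorted std::vector, and comparison is the default std::string ordering rather than whatever ordering the dictionary index uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: return the indices of every entry equal to
// `word` in a sorted word list.
std::set<std::size_t> lookup_all(const std::vector<std::string> &sorted_words,
                                 const std::string &word)
{
    // Binary search is only valid on sorted input.
    assert(std::is_sorted(sorted_words.begin(), sorted_words.end()));

    std::set<std::size_t> hits;
    std::size_t lo = 0, hi = sorted_words.size();
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo) / 2;
        const int cmp = sorted_words[mid].compare(word);
        if (cmp == 0) {
            // Found one match: walk backwards and forwards to cover the
            // whole block of equal entries.
            std::size_t first = mid;
            while (first > 0 && sorted_words[first - 1] == word)
                --first;
            std::size_t last = mid;
            while (last + 1 < sorted_words.size() && sorted_words[last + 1] == word)
                ++last;
            for (std::size_t i = first; i <= last; ++i)
                hits.insert(i);
            return hits;
        }
        if (cmp < 0)
            lo = mid + 1; // entry at mid sorts before word
        else
            hi = mid; // entry at mid sorts after word
    }
    return hits; // empty: no entry matched
}
```

The walk is linear in the size of the matching block, so the cost stays close to that of a single binary search when only a few entries share the same headword.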

This solution was in practice a bit more complicated because .otf cache
files require a bit more fiddling, and also the ->lookup methods are
used by some callers to find the next entry if no entry was found. But
on the whole it's not too drastic of a change from the previous setup.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2021-11-14 22:38:26 +03:00
Evgeniy Dushistov
3d15ce3b07 Merge pull request #77 from cyphar/multi-word-lookups
lookup: do not bail on first failed lookup with a word list
2021-10-17 21:03:14 +03:00
Aleksa Sarai
51338ac5bb lookup: do not bail on first failed lookup with a word list
Due to the lack of deinflection support in StarDict, users might want to
be able to create a list of possible deinflections and search each one
to see if there is a dictionary entry for that deinflection.

Being able to do this in one sdcv invocation is far preferable to
calling sdcv once for each candidate due to the performance cost of
doing so. The most obvious language that would benefit from this is
Japanese, but I'm sure other folks would prefer this.

In order to make this use-case better supported -- try to look up every
word in the provided list of words before exiting with an error if any
one of the words failed to be looked up.
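
A rough sketch of that control flow, under the assumption that each word is looked up by some callable returning true on success. `lookup_words` and `lookup_one` are hypothetical names for illustration and do not correspond to functions in sdcv.

```cpp
#include <cassert>
#include <cstdlib>
#include <functional>
#include <string>
#include <vector>

// Hypothetical driver: look up every word, remember whether any lookup
// failed, and only report the failure once the whole list is processed.
int lookup_words(const std::vector<std::string> &words,
                 const std::function<bool(const std::string &)> &lookup_one)
{
    assert(static_cast<bool>(lookup_one)); // a callable must be supplied

    bool all_found = true;
    for (const auto &word : words) {
        if (!lookup_one(word))
            all_found = false; // note the failure, but keep going
    }
    return all_found ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

The earlier behavior would correspond to returning from inside the loop on the first failed lookup, which skips the remaining candidates.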

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2021-09-29 03:28:44 +10:00
Evgeniy Dushistov
5ada75e08d Merge pull request #73 from 258204/json
Added --json (same as --json-output) to match man
2021-06-21 12:45:09 +03:00
258204
c7d9944f7d Added --json (same as --json-output) to match man 2021-06-19 19:19:31 -06:00
Evgeniy Dushistov
3963e358cd Merge pull request #68 from NiLuJe/glib-getopt
Handle "rest" arguments the glib way
2021-01-27 16:33:36 +03:00
NiLuJe
3b26731b02 Making glib think it's a filename instead of a string prevents the
initial UTF-8 conversion

At least on POSIX.

Windows is another kettle of fish. But then it was probably already
broken there.
2021-01-14 19:26:06 +01:00
NiLuJe
070a9fb0bd Oh, well, dirty hackery it is, then.
The previous approach only works as long as locales are actually sane
(i.e., the test only passes if you *actually* have the ru_RU.KOI8-R
locale built, which the CI doesn't).
2021-01-12 04:37:07 +01:00
NiLuJe
8f096629ec Unbreak tests
glib already runs the argument through g_locale_to_utf8 with
G_OPTION_REMAINING
2021-01-12 04:16:03 +01:00
NiLuJe
25768c6b80 Handle "rest" arguments the glib way
Ensures the "stop parsing" token (--) is handled properly.
2021-01-12 03:35:55 +01:00
Evgeniy Dushistov
4ae4207349 Merge pull request #67 from doozan/master
Use binary search for synonyms, fixes #31
2020-12-23 04:30:13 +03:00
Jeff Doozan
994c1c7ae6 Use mapfile directly instead of buffer 2020-12-21 17:10:37 -05:00
Jeff Doozan
d38f8f13c9 Synonyms: Use MapFile 2020-12-21 08:53:29 -05:00
Jeff Doozan
cc7bcb8b73 Fix crash if dictionary has no synonyms 2020-12-19 18:37:15 -05:00