counting words in 2010


cfrees
As I understand it, TeXShop uses /usr/texbin/detex to calculate
document statistics (words, characters, lines). Specifically, it calls
detex via a wrapper script included in the application's resources. (In
my case, the wrapper has been "tweaked" but this is not relevant here.)

The problem I'm seeing is with /usr/texbin/detex as supplied with TeX
Live 2010 as opposed to the versions supplied with TeX Live 2008 and
2009. Essentially, I'm getting much lower word counts than I should
because detex is stripping out text which it really shouldn't. The
things I'm certain about include footnote text and italicised text but
I suspect these are just a part of the problem.

I'm hoping this isn't intended to be a feature. Does anybody know:
- if this is a known (or unknown) bug?
- if there is any way of working around it? (I'm currently using the
   2009 issue of detex but that's a bit messy.)
- if there is a better way of getting document statistics?

Specifically, I need word counts which are as accurate as possible. But
if there is to be inaccuracy, it is generally better if the count is
reported as slightly higher than it really is rather than lower because
I'm typically trying to write stuff which does not exceed a given limit.
This makes the current detex almost useless.

I know detex is used for more than word counts but can't imagine what
purpose is served by stripping out italic text, for example. Please,
this isn't supposed to be a feature, is it? Please?!

This is also intended to alert people who rely on TeXShop's statistics
(or detex | wc) that the results may be unreliable with TeX Live 2010.
Perhaps I missed it, but I don't recall seeing any warnings to this
effect or information about changes to the current version of detex.
(If anybody saw such and can send me a pointer, that'd be great.)

Thanks,
cfr
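
For anyone wanting to repeat the comparison outside TeXShop, here is a minimal sketch of the detex | wc approach. The function name count_words and the DETEX override variable are made up for illustration and are not part of detex or TeXShop; the 2009 path in the usage comment is a guess by analogy with a standard TeX Live layout.

```shell
# Count words in a TeX file after stripping markup.
# DETEX is a hypothetical override variable (an assumption of this sketch);
# it defaults to whichever detex is first on PATH.
count_words() {
    "${DETEX:-detex}" "$1" | wc -w | tr -d ' '
}

# Usage sketch: compare two installations on the same file.
# DETEX=/usr/local/texlive/2009/bin/universal-darwin/detex count_words paper.tex
# DETEX=/usr/local/texlive/2010/bin/universal-darwin/detex count_words paper.tex
```

Running the same file through each installation in turn makes it easy to see whether the totals differ before looking for the cause.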

----------- Please Consult the Following Before Posting -----------
TeX FAQ: http://www.tex.ac.uk/faq
List Reminders and Etiquette: http://email.esm.psu.edu/mac-tex/
List Archive: http://tug.org/pipermail/macostex-archives/
TeX on Mac OS X Website: http://mactex-wiki.tug.org/
List Info: http://email.esm.psu.edu/mailman/listinfo/macosx-tex

Re: counting words in 2010

David Derbes
I'm not sure if this is a solution, but Excalibur counts words as it spell-checks.

There is also WordService from Devon Technologies that works with many programs; it has a word count feature.
Free.

http://www.devon-technologies.com/products/freeware/services.htm

David Derbes
U of Chicago Laboratory Schools



On Oct 28, 2010, at 6:56 PM, Dr. Clea F. Rees wrote:

> As I understand it, TeXShop uses /usr/texbin/detex to calculate
> document statistics (words, characters, lines). Specifically, it calls
> detex via a wrapper script included in the application's resources. (In
> my case, the wrapper has been "tweaked" but this is not relevant here.)
>
> The problem I'm seeing is with /usr/texbin/detex as supplied with TeX
> Live 2010 as opposed to the versions supplied with TeX Live 2008 and
> 2009. Essentially, I'm getting much lower word counts than I should
> because detex is stripping out text which it really shouldn't. The
> things I'm certain about include footnote text and italicised text but
> I suspect these are just a part of the problem.
>
> I'm hoping this isn't intended to be a feature. Does anybody know:
> - if this is a known (or unknown) bug?
> - if there is any way of working around it? (I'm currently using the
>  2009 issue of detex but that's a bit messy.)
> - if there is a better way of getting document statistics?
>
> Specifically, I need word counts which are as accurate as possible. But
> if there is to be inaccuracy, it is generally better if the count is
> reported as slightly higher than it really is rather than lower because
> I'm typically trying to write stuff which does not exceed a given limit.
> This makes the current detex almost useless.
>
> I know detex is used for more than word counts but can't imagine what
> purpose is served by stripping out italic text, for example. Please,
> this isn't supposed to be a feature, is it? Please?!
>
> This is also intended to alert people who rely on TeXShop's statistics
> (or detex | wc) that the results may be unreliable with TeX Live 2010.
> Perhaps I missed it, but I don't recall seeing any warnings to this
> effect or information about changes to the current version of detex.
> (If anybody saw such and can send me a pointer, that'd be great.)
>
> Thanks,
> cfr

Re: counting words in 2010

cfrees
On Thu 28th Oct, 2010 at 19:59, David Derbes seems to have written:

> I'm not sure if this is a solution, but Excalibur counts words as it spell-checks.

Hmm... I didn't know that. How accurate is it?

> There is also WordService from Devon Technologies that works with many programs; it has a word count feature.
> Free.
>
> http://www.devon-technologies.com/products/freeware/services.htm

I use this for paragraphs but it is no use where there's a lot of
markup or for entire documents because it doesn't filter out the TeX
stuff at all. But you are right that it is a very useful service to
have installed.

Thanks,
cfr


Re: counting words in 2010

Michael Sharpe
In reply to this post by cfrees

On Oct 28, 2010, at 4:56 PM, Dr. Clea F. Rees wrote:

> The problem I'm seeing is with /usr/texbin/detex as supplied with TeX
> Live 2010 as opposed to the versions supplied with TeX Live 2008 and
> 2009. Essentially, I'm getting much lower word counts than I should
> because detex is stripping out text which it really shouldn't. The
> things I'm certain about include footnote text and italicised text but
> I suspect these are just a part of the problem.

In testing some short example files, it seemed that the contents of \emph{}, \textbf{} and so on are being counted, but not the contents of \footnote{}. However, footnote text built with the construction \begin{footnote}{}\end{footnote} seemed to be correctly counted. The man page for detex spells out the limited customizations that may be applied to detex.

Michael


Re: counting words in 2010

C.H.E.
In reply to this post by cfrees
Command-line tool texcount is great for this. I use the -sum flag, like

texcount -sum foo.tex

and I'm happy with the results. However, I don't honestly know whether it is subject to the complaint you have with detex.

Re: counting words in 2010

Daniel Becker-4
On 29.10.2010, at 06:11, C.H.E. wrote:
> Command-line tool texcount is great for this. I use the -sum flag, like
>
> texcount -sum foo.tex
>
> and I'm happy with the results. However, I don't honestly know whether it is
> subject to the complaint you have with detex.

I am using two engines for word counting that are based on texcount. They ship with TeXShop; see

/Applications/TeX/TeXShop.app/Contents/Resources/TeXShop/Engines/Inactive/texcount/

Ramon wrote AppleScripts: http://www2.hawaii.edu/~ramonf/TeXShop/index.html

For me, texcount seems to be the most accurate of the tools available. And it is actively developed.
http://folk.uio.no/einarro/Comp/texwordcount.html
has 2.3alpha, MacTeX 2010 comes with 2.2

Daniel

Re: counting words in 2010

Daniel Becker-4
On 29.10.2010, at 08:15, Daniel Becker wrote:
>
> http://folk.uio.no/einarro/Comp/texwordcount.html
> has 2.3alpha, MacTeX 2010 comes with 2.2

The zip file there seems to be broken. The one from http://folk.uio.no/einarro/TeXcount/download.html can be unzipped.

Daniel

Re: counting words in 2010

cfrees
In reply to this post by Michael Sharpe
On Thu 28th Oct, 2010 at 18:59, Michael Sharpe seems to have written:

>
> On Oct 28, 2010, at 4:56 PM, Dr. Clea F. Rees wrote:
>
>> The problem I'm seeing is with /usr/texbin/detex as supplied with TeX
>> Live 2010 as opposed to the versions supplied with TeX Live 2008 and
>> 2009. Essentially, I'm getting much lower word counts than I should
>> because detex is stripping out text which it really shouldn't. The
>> things I'm certain about include footnote text and italicised text but
>> I suspect these are just a part of the problem.
>
> In testing some short example files, it seemed that the contents of \emph{}, \textbf{} and so on are being counted, but not the contents of \footnote{}. However, footnote text built with the construction \begin{footnote}{}\end{footnote} seemed to be correctly counted. The man page for detex spells out the limited customizations that may be applied to detex.
Strange. detex is definitely taking out the contents of \emph{} here.
If I run detex without piping through wc, I can see the relevant words
are missing. (This is without passing detex any options.) Maybe we are
using different versions of detex? Earlier versions definitely did not
behave in this way.

Thanks,
cfr
> Michael
>
>

Re: counting words in 2010

Michael Sharpe

On Oct 30, 2010, at 12:46 PM, <[hidden email]> <[hidden email]> wrote:

> Strange. detex is definitely taking out the contents of \emph{} here.
> If I run detex without piping through wc, I can see the relevant words
> are missing. (This is without passing detex any options.) Maybe we are
> using different versions of detex? Earlier versions definitely did not
> behave in this way.

I'm using the intel 64-bit version of detex dated 7/13/10,
though I get the same results if I use the universal 32-bit version of detex dated 6/16/10.

In both cases, the detex output from the line

Test\footnote{ a footnote}  \emph{it} quickly.

is

Test  it quickly.
Michael

Re: counting words in 2010

cfrees
On Sat 30th Oct, 2010 at 13:34, Michael Sharpe seems to have written:

>
> On Oct 30, 2010, at 12:46 PM, <[hidden email]> <[hidden email]> wrote:
>
>> Strange. detex is definitely taking out the contents of \emph{} here.
>> If I run detex without piping through wc, I can see the relevant words
>> are missing. (This is without passing detex any options.) Maybe we are
>> using different versions of detex? Earlier versions definitely did not
>> behave in this way.
>
> I'm using the intel 64-bit version of detex dated 7/13/10,
> though I get the same results if I use the universal 32-bit version of detex dated 6/16/10.
My version is dated 17/6/2010. I'm definitely using the universal
32-bit version or it wouldn't be working at all.

> In both cases, the detex output from the line
>
> Test\footnote{ a footnote}  \emph{it} quickly.
>
> is
>
> Test  it quickly.

I can't now reproduce the disappearance of \emph text. Not even using
the same document. (I've edited it but not any of the TeX stuff.)

However, when I put your sample sentence into a test file and run
detex, I get:
  Test  a footnote  it quickly.
Although footnotes are still being deleted from my paper when run
through detex.

So now I'm just confused and have no idea what's going on. I get the
following results:
                 lines   words   chars
detex 2010 + wc:   208    3088   19174
detex 2009 + wc:   192    3215   20003
detex 2008 + wc:   192    3215   20003
texcount:
  Words in text: 3115
  Words in headers: 17
  Words in float captions: 121
  Number of headers: 5
  Number of floats: 0
  Number of math inlines: 8
  Number of math displayed: 0
All of these are being run without any customisation. I'm not surprised
to get a different result from texcount, of course. (Though I'd like to
know which way of counting is most accurate!) But I'm curious about the
different results from different versions of detex. The documentation
doesn't seem to be any different. I downloaded and compiled the source
and get the same results as those for 2010 above. That version is 2.8
but the documentation still refers to 2.6. So maybe changes made to 2.7
are responsible for the different results? (Versions 2.7 and 2.8 are
identical aside from the licence, I believe.)
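
One way to see exactly which words go missing between versions, rather than only the totals, is to diff the two outputs word by word. A sketch, assuming both detex binaries are available locally; detex_word_diff is a hypothetical helper name and the paths are whatever your installations use.

```shell
# Show words that one detex keeps and another drops.
# Arguments: <detex-A> <detex-B> <file.tex>; all three are supplied by the caller.
detex_word_diff() {
    a=$(mktemp) && b=$(mktemp) || return 1
    # Put every word on its own line so diff reports individual words.
    "$1" "$3" | tr -s '[:space:]' '\n' > "$a"
    "$2" "$3" | tr -s '[:space:]' '\n' > "$b"
    diff "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return $status
}

# Usage sketch:
# detex_word_diff /path/to/2009/detex /path/to/2010/detex paper.tex
```

Lines starting with "<" are words present only in the first program's output, so they point straight at the text the second one removed.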

Thanks,
Clea


> Michael

Re: counting words in 2010

Peter Dyballa
In reply to this post by cfrees

On 30.10.2010, at 21:46, <[hidden email]> wrote:

> Maybe we are using different versions of detex?

Maybe! I don't have the stable version from TL '10, but the detex  
versions from TL '08 'til '11 work OK, as expected.

Clea, can you describe exactly how you perform your tests?

--
Greetings

   Pete

No project was ever completed on time and within budget.
                                – Cheops Law


Re: counting words in 2010

Herbert Schulz

On Oct 30, 2010, at 5:21 PM, Peter Dyballa wrote:

>
> On 30.10.2010, at 21:46, <[hidden email]> wrote:
>
>> Maybe we are using different versions of detex?
>
> Maybe! I don't have the stable version from TL '10, but the detex versions from TL '08 'til '11 work OK, as expected.
>
> Clea, can you describe exactly how you perform your tests?
>
> --
> Greetings
>
>  Pete
Howdy,

You have the detex from TL'11? :-)

With TL-2010 if I run detex on

\documentclass{article}
\begin{document}
Hello World\footnote{ footnote}
\end{document}

I only get

Hello World

rather than the expected

Hello World footnote

isn't that what you get too?

Good Luck,

Herb Schulz
(herbs at wideopenwest dot com)




Re: counting words in 2010

Peter Dyballa

On 31.10.2010, at 00:43, Herbert Schulz wrote:

> You have the detex from TL'11? :-)

Compiled from the sources...

>
> isn't that what you get too?

Well, knowing that Einstein was German (really? at least he spoke
German!) and wanted things to be kept simple, I tested it the simple
way on the command line:

        echo "Please test\\footnote{\\textbf{this} footnote}, \\emph{but} quickly!"
        echo "Please test\\footnote{\\textbf{this} footnote}, \\emph{but} quickly!" | wc
        echo "Please test\\footnote{\\textbf{this} footnote}, \\emph{but} quickly!" | /usr/local/texlive/2010/bin/universal-darwin/detex | wc

In the last case I can use different versions of detex. The results  
from the three different command lines are:

        Please test\footnote{\textbf{this} footnote}, \emph{but} quickly!
               1       5      66
               1       6      40

The first line shows that the chosen syntax is OK. The second line is
wc's count of the raw input: 1 line, 5 words (markup and text run
together) and 66 characters. The last line is the count after detex has
filtered the input: 1 line, 6 words and 40 characters, which is correct.
This command shows what the filtered text looks like:

        echo "Please test\\footnote{\\textbf{this} footnote}, \\emph{but} quickly!" | /usr/local/texlive/2010/bin/universal-darwin/detex
        Please test this footnote, but quickly!

Again, I do not have the official/ready/finished detex version of TL  
'10. Maybe this file is defective...


Well, which "detex" are you actually using? One via a TeXShop engine?  
Could you add to that engine file:

        echo -n "The detex programme soon to be used is certainly this one: " ; which detex

The output will appear in the console window, together with the word
count.
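
Since several TeX Live trees may be installed side by side, it can also help to list every detex on the PATH, not only the first one that which reports. A sketch in plain sh; list_on_path is a hypothetical helper, not an existing tool.

```shell
# Print every executable named "$1" found on PATH, in lookup order.
list_on_path() {
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        # Only report entries that exist and are executable.
        [ -x "$dir/$1" ] && printf '%s\n' "$dir/$1"
    done
    IFS=$old_ifs
}

# Usage sketch:
# list_on_path detex
```

The first path printed is the one a bare "detex" resolves to; any further lines are other copies that a wrapper script with a different PATH might pick up instead.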


Does the word count come closer to the expected value with a text body  
of

        Hello World\footnote{ footnote} and more

or

        Hello World\footnote{ footnote and more}

or

        Hello World\footnote{ footnote}!

--
Greetings

   Pete

November, n.:
        The eleventh twelfth of a weariness.
                – Ambrose Bierce, "The Devil's Dictionary"


Re: counting words in 2010

Michael Sharpe

On Oct 30, 2010, at 4:12 PM, Peter Dyballa wrote:

>
> On 31.10.2010, at 00:43, Herbert Schulz wrote:
>
>> You have the detex from TL'11? :-)
>
> Compiled from the sources...
>
>>
>> isn't that what you get too?
>
> Well, knowing that Einstein was German (really? at least he spoke German!) and wanted that things were made simple, I test this simple way on the command line:
>
> echo "Please test\\footnote{\\textbf{this} footnote}, \\emph{but} quickly!"
> echo "Please test\\footnote{\\textbf{this} footnote}, \\emph{but} quickly!" | wc
> echo "Please test\\footnote{\\textbf{this} footnote}, \\emph{but} quickly!" | /usr/local/texlive/2010/bin/universal-darwin/detex | wc
>
> In the last case I can use different versions of detex. The results from the three different command lines are:
>
> Please test\footnote{\textbf{this} footnote}, \emph{but} quickly!
>       1       5      66
>       1       6      40
>
> The first line shows that the syntax chosen is OK, the second line counts the run-together words OK (second figure, first one is the number of lines, last one that of the characters of the input line), and the last line is correctly filtered by detex. This command shows how it correctly filters:
>
> echo "Please test\\footnote{\\textbf{this} footnote}, \\emph{but} quickly!" | /usr/local/texlive/2010/bin/universal-darwin/detex
> Please test this footnote, but quickly!
>
> Again, I do not have the official/ready/finished detex version of TL '10. Maybe this file is defective...
>
>
> Well, which "detex" are you actually using? One via a TeXShop engine? Could you add to that engine file:
>
> echo -n "The detex programme soon to be used is certainly this one: " ; which detex
>
> The output will appear in the console window, together with the word count.
>
>
> Does the word count come closer to the expected value with a text body of
>
> Hello World\footnote{ footnote} and more
>
> or
>
> Hello World\footnote{ footnote and more}
>
> or
>
> Hello World\footnote{ footnote}!
>
There is a difference between what you are doing and what I think the rest of us are doing. The command line

echo "Please test\\footnote{\\textbf{this} footnote},\\emph{but} quickly." | detex

runs detex in plain TeX mode, and in my case, the result is

Please test this footnote,but quickly.

which, with 6 words, is identical to your result. However, detex switches to LaTeX mode when the input is part of a document containing \begin{document}, and the result is then the same as running

echo "Please test\\footnote{\\textbf{this} footnote},\\emph{but} quickly." | detex -l

(the -l option forces LaTeX mode), which gives

Please test ,but quickly.

It seems that in LaTeX mode, detex ignores plain tex commands that have a LaTeX version, and if you write

echo "Please test\\begin{footnote}{\\textbf{this} footnote}\\end{footnote},\\emph{but} quickly." | detex -l

the result is

Please testthis footnote,but quickly.

This does of course give an incorrect count because words were run together in LaTeX mode that were not run together in plain TeX mode. I would consider this a bug in detex. (I'm using the x86-64 version that came with TeX Live 2010, but I get exactly the same result with detex from the 2008 distribution.)
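
The two modes can be compared side by side on the same input. A small sketch, assuming detex reads standard input as in the commands above and accepts -t (force plain TeX mode) alongside -l; compare_modes and the DETEX variable are names invented here.

```shell
# Print the word count of one line of input under each detex mode.
# DETEX is a hypothetical override (an assumption of this sketch);
# it defaults to whichever detex is first on PATH.
compare_modes() {
    for flag in -t -l; do
        count=$(printf '%s\n' "$1" | "${DETEX:-detex}" "$flag" | wc -w | tr -d ' ')
        printf '%s %s\n' "$flag" "$count"
    done
}

# Usage sketch:
# compare_modes 'Please test\footnote{\textbf{this} footnote}, \emph{but} quickly.'
```

If the two numbers differ for a line that contains \footnote{}, the mode-dependent behaviour described above is the likely cause.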

Michael




Re: counting words in 2010

Peter Dyballa

On 31.10.2010, at 23:55, Michael Sharpe wrote:

> This does of course give an incorrect count because words were run  
> together in LaTeX mode that were not run together in plain tex mode.  
> I would consider this a bug in detex. (I'm using the x86-64 version
> that came with TeXLive 2010, but I get exactly the same result with  
> detex from the 2008 distribution.)


I can confirm that detex from the test or pre-test versions of TL '10  
and '11 shows the faulty behaviour you describe and Clea first found.  
I cannot confirm the behaviour in TL '08, for me it's correct. I have  
PPC hardware and PPC binaries, except for TL '10 pre, which are UB.

The same faulty behaviour appears when using -w (or -wt) and -wl: In  
the LaTeX case the footnote text is removed. (BTW, after the comma and  
before \\emph I had inserted a TAB. In your output Michael, you can  
see the SPACE character which is substituted for the footnote text,  
see later.)


The problem is also with the test file for detex:

        \documentclass{article}
        \begin{document}
       
        This is the first paragraph.
       
        \section{First Section}
       
        Preamble of Sect.~1.
       
        \subsection{A Subsection}
       
        Here some text, an inline formula $(a+b)^2=a^2+2ab+b^2$, as well
        as a displayed equation
        %
        \begin{equation}
        e^{\pm ix}=\cos x \pm i \sin x\;,
        \end{equation}
        %
        and some more text.
       
        Now some verbatim text \verb|a b c|.  That's all, folks.
       
        \end{document}

There is no \footnote{} in it...


There is also another bug in detex, I think. I looked into the (F)LEX  
output, which has:

        <Normal>"\\part"{Z} ;
        <Normal>"\\section"{Z} ;
        <Normal>"\\subsection"{Z} ;
        <Normal>"\\subsubsection"{Z} ;
        <Normal>"\\paragraph"{Z} ;
        <Normal>"\\sunparagraph"{Z} ;

Well, I *can* understand why in November one thinks of the sun – but  
it's wrong! Later this line comes:

        <Normal>"\\footnote" {KILLARGS(1); SPACE;}

Here we have it: a footnote is not assumed to be counted...
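
Until detex itself changes, one possible workaround is to turn \footnote{...} into a plain brace group before detex sees it. This is only a sketch: unwrap_footnotes is a made-up name, it ignores \footnote[n]{...} with an optional argument, and it assumes detex keeps the text inside bare braces, which should be checked on the detex version in use.

```shell
# Rewrite \footnote{...} as a plain brace group so the footnote text is
# no longer an argument of \footnote. Reads the named files, or standard
# input when no file is given.
unwrap_footnotes() {
    sed 's/\\footnote{/ {/g' "$@"
}

# Usage sketch (assumes detex is on PATH):
# unwrap_footnotes paper.tex | detex -l | wc -w
```

The inserted space keeps the footnote text from running into the preceding word, which errs on the side of a slightly higher count rather than a lower one.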


Other bugs: one "d" too many in "and":

        ErrorExit("-e option requires and argument");

and too obvious:

            ErrorExit("The environtment list contains too many environments");


At daylight I'll send a message to the TeX Live list, but I won't mention
the footnote problem. Clea, are you going to send a message yourself?

--
Greetings

   Pete

November, n.:
        The eleventh twelfth of a weariness.
                – Ambrose Bierce, "The Devil's Dictionary"


Re: counting words in 2010

cfrees
On Mon 1st Nov, 2010 at 01:02, Peter Dyballa seems to have written:

>
> On 31.10.2010, at 23:55, Michael Sharpe wrote:
>
>> This does of course give an incorrect count because words were run together
>> in LaTeX mode that were not run together in plain tex mode. I would
> consider this a bug in detex. (I'm using the x86-64 version that came with
>> TeXLive 2010, but I get exactly the same result with detex from the 2008
>> distribution.)
>
>
> I can confirm that detex from the test or pre-test versions of TL '10 and '11
> shows the faulty behaviour you describe and Clea first found. I cannot
> confirm the behaviour in TL '08, for me it's correct. I have PPC hardware and
> PPC binaries, except for TL '10 pre, which are UB.
I also don't see the problem with TL '08. Similarly for TL '09. It
first appears for me in TL '10. And I am also on PPC.

> The same faulty behaviour appears when using -w (or -wt) and -wl: In the
> LaTeX case the footnote text is removed. (BTW, after the comma and before
> \\emph I had inserted a TAB. In your output Michael, you can see the SPACE
> character which is substituted for the footnote text, see later.)
>
>
> The problem is also with the test file for detex:
>
> \documentclass{article}
> \begin{document}
>
> This is the first paragraph.
>
> \section{First Section}
>
> Preamble of Sect.~1.
>
> \subsection{A Subsection}
>
> Here some text, an inline formula $(a+b)^2=a^2+2ab+b^2$, as well
> as a displayed equation
> %
> \begin{equation}
> e^{\pm ix}=\cos x \pm i \sin x\;,
> \end{equation}
> %
> and some more text.
>
> Now some verbatim text \verb|a b c|.  That's all, folks.
>
> \end{document}
>
> There is no \footnote{} in it...
>
>
> There is also another bug in detex, I think. I looked into the (F)LEX output,
> which has:
>
> <Normal>"\\part"{Z} ;
> <Normal>"\\section"{Z} ;
> <Normal>"\\subsection"{Z} ;
> <Normal>"\\subsubsection"{Z} ;
> <Normal>"\\paragraph"{Z} ;
> <Normal>"\\sunparagraph"{Z} ;
>
> Well, I *can* understand why in November one thinks of the sun – but it's
> wrong! Later this line comes:
>
> <Normal>"\\footnote" {KILLARGS(1); SPACE;}
>
> Here we have it: a footnote is not assumed to be counted...
>
>
> Other bugs, a "d" too much in "and":
>
> ErrorExit("-e option requires and argument");
>
> and too obvious:
>
>    ErrorExit("The environtment list contains too many
> environments");
>
>
> At daylight I'll send a message to the TeX Live list, I won't mention the
> footnote problem. Clea, are you going to send a message yourself?
I can certainly do that.

Thanks,
Clea

> --
> Greetings
>
> Pete
>
> November, n.:
> The eleventh twelfth of a weariness.
> – Ambrose Bierce, "The Devil's Dictionary"
>

Re: counting words in 2010

Peter Dyballa

On 01.11.2010, at 01:11, <[hidden email]> <[hidden email]> wrote:

> I can certainly do that.

There could be some sense in neglecting footnotes; maybe there is
some, presumably old, tradition of not counting footnotes as text since
they are small print... The readers on the TeX Live list might know
the reason for this. And they can decide to stop neglecting footnotes,
maybe by an option?

--
Greetings

   Pete

Upgraded, adj.:
        Didn't work the first time.


Re: counting words in 2010

Mervyn Thomas
For me the need to count words in footnotes varies. Sometimes I need a count with footnotes, sometimes a count without footnotes.
An option would be very useful.

Thanks again to the developers for a really beautiful product.

Mervyn Thomas

On 01/11/2010, at 10:35 AM, Peter Dyballa wrote:

>
> On 01.11.2010, at 01:11, <[hidden email]> <[hidden email]> wrote:
>
>> I can certainly do that.
>
> There could be some sense in neglecting footnotes; maybe there is some, presumably old, tradition of not counting footnotes as text since they are small print... The readers on the TeX Live list might know the reason for this. And they can decide to stop neglecting footnotes, maybe by an option?
>
> --
> Greetings
>
>  Pete
>
> Upgraded, adj.:
> Didn't work the first time.
>
