Weaponizable Technologies

Weapons are devices

That can harm people

Can also harm property

But that’s less important

 

Weapons are technologies

Not necessarily physical

As in the Foucauldian sense

 

In that sense,

They can also

Harm society

And culture,

Civilizations

Humanity itself

 

And,

More importantly

The very idea of

What humanity is

 

In the Foucauldian sense, they

Can generate chain reactions

Just like nuclear technologies

And they can destroy humanity

Just like fission-fusion weapons

 

Weapons or technologies

Are not tied to a particular

Ideology or even a religion

 

In the Foucauldian sense,

Conventional technologies

 

Are clandestinely

Or benevolently

Developed, and

Are weaponized

 

They are proliferated

Then are exposed

Are opposed, and

Then, gradually

Are normalized

Are assimilated

Into our social fabric

 

The protests against the weapons

And weaponized technologies

As in the world we have made

Not necessarily in the world

That we could perhaps make

Are very predictable phenomena

 

They can start out very strong

Then they become a shadow of

Themselves, or even a parody

 

At best they can become, and

Exist for a longish time, even

Perhaps with ups and downs

 

With limited longish term achievements

Or with very impressive short term ones

Or with no effect on the status quo at all

 

A connoisseur’s delight

They often are reduced to

 

At worst they may become

Freak shows on the fringes

As Kipling showed in a story

Even if they are genuine

Not the fake ones: A part

Of Manufactured Dissent

 

A protest is a lot like a balm

A protest that is for a single issue

Or, at most, a few such issues

For the people who are hurting

 

In that sense, they are a good thing

But pardon me, for I feel duty bound

To spoil the positivity with some

Unallied and honest bit of truth

 

For they are mostly just balms

That give temporary relief

From the symptoms only

 

They are necessary, but not sufficient

They are not cures in the end

And they come at the expense

Of some other people, who are

Also very much hurting, and

Their issues, symptomatically,

Can be very much different

 

In fact, they can be the exact

Contraries of the issues of the

First set of people who are hurting

 

The powers that be are apt to play

The one against the other, and

The little or large bits of evil

In all of us ensure that we play

That game, of our own volition

Collectively, so that none feels guilty

 

On our own initiative even, or

So we might convince ourselves

 

Weaponised technologies then

Not just weaponizable ones

 

Are morally

And ethically

And legally

Sanctioned finally

 

That means that

They are approved

By general society

 

And they become

An integral part

A necessary part

Of the civilization

 

They are never

Ever sufficient

 

They become fait accompli

Which is a terrifying phrase

 

After enough time

They are taken

For granted

Are not even

Noticed in our

Everyday life

 

Most of us forget what they mean

Or what they are, how they work

They become part of our natural

Reality, our very natural universe

 

Who can use weapons?

 

Anyone can use them

If they can get access

 

To them, somehow, anyhow

 

And they will be used

Later on, if not sooner

Over there, if not here

At least in the beginning

 

The good guys can use them

Or those who claim to be so

We all know what that means

 

The bad guys can use them

The ugly guys can use them

The evil guys can use them

 

Individually evil can use them

Collectively evil can use them

 

More likely the latter

 

Anyone anywhere anytime on

The whole political spectrum

Can use them, if less or more

Individually or collectively

 

More likely the latter

 

There is absolutely

No guarantee that

Any of the above

Or indeed all of them

Can’t use them at all

Ever and anywhere

 

But can the weak and the meek

Or the tired and the poor

Use them as much as the

Strong and the powerful

To the same extent, even

For the purpose of self-defense?

 

Can single individuals use them

As much as the collective

To the same extent, even

For the purpose of self-defense?

 

First they are used over there

On those we don’t care about

Then they are used over here

 

And when that happens

There are fresh protests

 

We all care about ourselves

Even if we don’t about them

 

Once again, they

Are exposed: For us

Are opposed, and

Then, gradually

Are normalized

Are assimilated

Into our social fabric

Our very own life

 

Excluding them over there

They are already included

We still don’t care about them

We still care only for ourselves

 

Like before, again

They are morally

And ethically

And legally

Sanctioned finally

 

This time, however

For us, not just them

 

Some weaponized technologies

Are so totally unthinkably evil

That their existence is not even

Acknowledged, for preserving

Collective sense of being good

 

Such technologies are only used

Clandestinely, outside all records

So they leave no evidence at all

 

Who do they mean to target?

The demonized are targeted

Mentally-ill may be targeted

Truly subversive freethinkers

May be targeted, selectively

Misfits and loners can also

Be targeted with these ones

 

And, above all

 

The uncontaminated

(Unalloyed, if you like

Or unallied, if you like)

The incorrigible

Truth seekers, As

They may be called

Justice seekers also

Unalloyed or unallied

Can be targeted with

These unacknowledged

Weaponized technologies

In the Foucauldian sense

 

For The Greater Good

Seems they are called

Coal Mine Canaries

Freelance Test Rats

They may not be paid

May not even consent

 

They don’t even know this

That they have been made that

This is the most evil part

Of the scheme, in which

 

All “schematism” had to be avoided

 

So they can’t even share

With anyone at all

Let alone lodge a protest

 

They become Dead Canaries

If they come uncomfortably

Close to the truths that matter

 

In fact, these technologies

Are, by their very nature

Made only for selective use

Personalization is their

Key feature, their identifier

 

One of them had even

Got put on the record

Perhaps due to naïveté

It was called Zersetzung

It specifically recorded

Naïvely, as it turned out

It specifically wrote down

 

This kind of weaponised technology

Is a collective, organised and mobilised

Version of what is called gaslighting

 

A later version of it was called COINTELPRO

Who knows how many different versions of it

Exist today in how many places

Officially or unofficially

Recorded or unrecorded

 

In the original version called Zersetzung

All “schematism” had to be avoided

Because that would make opposition

And protest against it easily possible

 

It being: The collective using it?

 

Individuals simply can’t use it

Not to the same degree and reach

Not anywhere remotely close

 

Or the technology itself only?

 

Or why not both of them?

 

But we had better not forget

Technologies are the means

Religions and ideologies are

About the ends, not the means

For them, practically speaking

Ends always justify the means

 

Even if they are, unthinkably

Unredeemably, only pure evil

 

However, we are all endowed with

The extreme powers of self-deception

Individually yes, but also collectively

 

So we still manage to think that they

Are still for them, over there, not us

They are within our society, never us

They are still for them, not over here

Over there can be much nearer now

But it is still over there, and for them

 

Thus, once more magically

They become fait accompli

With a very different context

But actually the same context

 

They are always necessary

So it is claimed, benevolently

But they are never sufficient

 

This is a universal theorem

If you like to be very precise

Then it is at the very least

A pretty likely conjecture

 

And so we march on forward

Or even backward oftentimes

Or sideways, if necessary

Which can be very effective

If you know what I mean

 

In search of new weapons

And ever new technologies

 

That can be weaponized

Easily and yes, inevitably

Even if you don’t believe

In Inevitabilism at all

 

What really is inevitable

However, is the fact that

Some weak, or the meek

Or an isolated individual

Perhaps crazy, perhaps not

Will use them occasionally

Usually after provocation

But sometimes without it

 

Or some collective

Rogue or not rogue

 

A matter of definition

 

Will also make use of them

Regularly or occasionally

 

That is a great opportunity

A motivation for finding

Implementing and using

Ever more lethal weapons

Weaponized technologies

And some non-lethal ones

In the Foucauldian sense

 

We find new evils

We define new evils

We create new evils

 

We get new weapons

To fight newest evils

Which creates even

More ever new evils

 

Thus the circle of evil

Closes in upon us all

Over there, over here

 

So what do you think about it?

***

Originally published on 14th August, 2019. Updated on 20th September, 2019.

A Challenge for RTI Activists in India

There is a major issue that most people, including activists in India, have not given as much attention as it merits. That issue is the surveillance of ordinary people, especially within offices, gated societies, campuses and in some cases even independent houses. The use of electronic devices for surveillance is far more widespread than the occasionally reported phone tapping cases. Potentially, and I think in reality too, this is hampering all kinds of normal activities that people can indulge in, including acts of dissidence and protest, which I think are the special target of such practices. It has come to the point where any kind of protest activity in India is being ‘nipped in the bud’, at least in urban areas. This is making all the talk about there being democracy in India a joke.

Whether or not I am wrong in saying the above, there is sufficient evidence about the potential and real misuse of surveillance devices. This is part of a worldwide trend that has intensified in the last ten years and many such cases have been reported in various countries, including by the mainstream media, which usually avoids such topics these days. One concrete, practical action that can be taken in this regard is to demand information about this under the Right to Information Act. Since I am not competent enough to do this on my own and I have no contacts of any sort whose help I can take, I challenge (or appeal, whichever way you like to see it) the RTI activists to demand this information from the government as well as corporations.

I list below some specific points which I think should form the basis for such a demand. I only write them down here as rough indicators.

  1. Has the government sanctioned the use of electronic surveillance devices against ordinary people? If yes, who gives authorisation in specific cases and on what basis? What guidelines are followed? Who verifies that these guidelines are followed? Is there any mechanism through which the targeted person can ask for justification for any such surveillance?
  2. Are these devices being used in hotels, hostels, campuses and offices? What safeguards are there against their misuse? Who looks after this? On what basis are these places identified? Are they also being used in independent houses? If yes, what are the details?
  3. Are local administrators or managers or private security agencies allowed to make their own policies regarding this, ignoring any consideration for privacy of individuals? What is the mechanism through which information can be obtained about this and how can any redressal be sought?
  4. Are there any constraints about sharing the information collected through these means? Who decides about such things? Has it become a complete free for all where any administrator or manager or private security company can collect and disseminate such information?
  5. What is the role of IT companies in this, especially outsourcing companies such as TCS, Wipro, Infosys, who have huge numbers of employees, many of whom at any given time are not engaged in productive work? Are these employees being involved in unauthorised and illegal surveillance on ordinary people? What are the details about this, how can they be obtained? If this is happening, does the government know about it and was this officially sanctioned by the government?
  6. Is the information (or any falsified/distorted version of it) collected through surveillance (by whichever agency) being used for punitive purposes against people who are seen to be (rightly or wrongly, with justification or without justification) indulging in some kind of dissidence activity such as opposing the policies of privatisation and corporatisation of everything? If yes, what is the legal basis for this?
  7. Is such information being used to disrupt services such as Internet access and electricity supply for people who are being targeted by the surveillance policies?
  8. Is such information being used to launch smear campaigns against people seen as opposed to the official or corporate policies?
  9. Is such information being used to generally “make life impossible” (as one think tank writer proudly mentioned in one of his articles: on a dissident media website, no less) for the targeted people?
  10. Is such information being given to shopkeepers, hair dressers etc., with the instructions to not provide proper services (or deny providing services) to the targeted people?
  11. Is such information being used to ensure that the targeted people are denied jobs that they apply for? Is it being used to form a kind of (formal or informal) blacklist for employment and related purposes? Is it also being used to create hindrances in the work of these people, if they do get a job?
  12. What is the extent of the use of surveillance of any kind in academics? What is the purpose of such surveillance? Are students being involved in such activities as developers, system administrators and informers in general? What are the details of surveillance related projects sanctioned by the government specifically for academic institutions?
  13. To what extent are the communications service providers being used for surveillance, whether for the government or for corporations or for any other organisations?
  14. Does the government know about the use of surveillance devices by the large right-wing organisations and corporations/institutions sympathetic to them? If yes, have any steps been taken to stop this? Has there been any investigation into this?
  15. In case the answer to most of the questions above is negative, is there any mechanism to take action in case evidence is made available that would indicate that the answer to at least a few of these questions may be affirmative?

I have written the above only as initial notes. These can be refined and improved and extended. I would welcome any suggestions.

Full Disclosure: I am writing this as a person who believes that he has been a target of such practices for the last many years, although I don’t even claim to have indulged in much protest of any major significance. I am writing this almost as a last resort, having tried to ignore this issue for a long time, hoping that it would cease in due course. I don’t know what else I can do about this. Please note that being part of the ‘IT community’ in India, I am both more prone to it and also more likely to notice it.

I know how some people are going to react to it, but unless I thought it absolutely necessary (a matter of life and death), I wouldn’t have written it. I am generally not given to sticking my head out easily, though I do try to call a spade a spade. I am no Bradley Manning. But I guess my head is already out.

The Missing Clause

There is a legal agreement written in very legal language that I had to read today. It’s called a Mutual Confidentiality Agreement and is required to be signed by two parties who plan to collaborate on some commercial product or service.

After having plodded through the legalese and having understood most of it (I have an advantage in this regard), I found that there was one clause that was glaringly missing from it.

The document lists all the conditions that apply when the Disclosing Party discloses something to the Recipient. It has a section euphemistically titled ‘Injunctive Relief’ that might send shivers down the Recipient’s spine, depending on the power balance. It also lists all the exceptions under which these conditions may not apply. Such exceptions include “court order” and “as required by law”.

What is missing is something that should be included in all such documents post-9/11, in all countries that went for the security Gold Rush, which practically means all countries, (almost) period.

That missing clause should go something like this:

An (unintended) disclosure by the Recipient to any number of third parties of any of the Disclosing Party’s Confidential Information will not be considered a breach of the agreement if it happens under any of the following conditions:

  1. As part of surveillance operations carried out by the State and any of its agencies, the institution in which the Recipient works or any part thereof, the Local Version of the Truman Show, the Connectivity Service Providers, the Private Security Companies, the Local Quasi-authorised Vigilante Organisations or any other such agencies added to the list till the eve of the day the breach is considered for scrutiny.
  2. [Talking of eve] As a result of eavesdropping by the agencies and organisations listed in 1.
  3. As a result of disclosure by the people involved in (a) surveillance and (b) eavesdropping by the agencies and organisations listed in 1 to any of their superiors, colleagues, sub-ordinates, business associates, friends, relatives, family members or strangers.

The clause sounds very reasonable in the post-9/11 world and makes perfect legal sense. After all, any disclosure made (unintentionally) under conditions listed in this clause would not be the fault of the Recipient and it would only be for The Good of The Country and The World and The Humanity (as everyone knows and agrees to).

I have one doubt, however. Won’t the addition of this clause almost nullify everything else in this agreement to mutual confidentiality?

But the clause is required. Isn’t it?

And what about that poor thing, The Market?

Is it already being forgotten in favour of other things?

Sicilian Grand Prix Attack?

There is a website that I have, which has been inoperative for some time. There was not much content on it anyway. However, while working on another site located on the same server, I noticed that the site was being accessed heavily, but since it is inoperative, the web server is logging the errors.

This started on May 11th, 2011. The error log has become huge by the standards of any website that I maintain. Its size is 8 MB. It has more than 60000 entries, most of them being for the inoperative site I mentioned. And the total number of *distinct* IPs from which the site has been accessed is nearly 20000: way beyond the traffic that I get for even those sites which are operative and regularly used.
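
For anyone who wants to check figures like these on their own server, the following is a minimal sketch of how the entry count and the distinct-IP count could be obtained. It is not the method actually used for the numbers above. It assumes an Apache-style error log in which each request-related line carries a `[client <IPv4 address>]` field; the function name `summarize_error_log` and the regular expression are illustrative choices, and a different web server or log format would need a different pattern.

```python
import re
from collections import Counter

# Assumption: Apache-style error-log lines containing "[client 1.2.3.4]"
# (optionally "[client 1.2.3.4:port]"). Only IPv4 addresses are matched.
CLIENT_RE = re.compile(r"\[client (\d{1,3}(?:\.\d{1,3}){3})(?::\d+)?\]")


def summarize_error_log(lines):
    """Return (total entries, entries with a client IP, Counter of hits per IP)."""
    total = 0
    per_ip = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        total += 1
        match = CLIENT_RE.search(line)
        if match:
            per_ip[match.group(1)] += 1
    return total, sum(per_ip.values()), per_ip


def report(path, top=10):
    """Print a short summary for the log file at `path` (hypothetical helper)."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        total, with_ip, per_ip = summarize_error_log(fh)
    print(f"entries: {total}")
    print(f"entries with a client IP: {with_ip}")
    print(f"distinct client IPs: {len(per_ip)}")
    for ip, hits in per_ip.most_common(top):
        print(f"{hits:6d}  {ip}")
```

Calling `report("error.log")` (the path is a placeholder) prints the totals and the busiest addresses. A handful of addresses with thousands of hits and thousands of addresses with a few hits each would point to different kinds of traffic, which is why the per-IP counts are kept rather than just the size of the set.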

Two of the entries in the log file indicated that someone had posted a link to download a free book called ‘Starting Out: The Sicilian Grand Prix Attack’. But there has been no facility to add comments for this site on this server, although there was on the server where the site was earlier installed. So perhaps the cached post was from the earlier server.

The important thing is, there were only two requests for this post or this link to the book.

But then I searched for it on Google and saw the cached post about an hour ago.

From a few minutes after that, there is a flood of requests for the same link to the book on Sicilian Grand Prix Attack, even though the site is still inoperative. There are also more attempts to add new comments.

The ‘attack’ seems to continue and the size of the error log file is growing even now.

Meaning what? You tell me.

[Some information that might perhaps be relevant: The site was about a query language that I have designed. I had submitted a paper about this language to a workshop at a very prestigious conference. The paper was rejected. I received the notification on the 7th. Over the next two days, I had an exchange of emails with the PC chairs and the organizers of the workshop about my dissatisfaction with the reviews and the reviewing process. I also asked them to forward my comments to the reviewers. I could be identified from my comments, even if my name had been removed.]

(There are many simple explanations of the above. One of them is that the writer is a moron.)

Drones, Aerial and Otherwise

[This was meant to be a comment in reply to an article on the ZNet by Pervez Hoodbhoy about aerial drones and what he calls ‘human drones’.]

I feel very strange, in fact disturbed, to have to make this comment, as this comment is critical of the ideas of someone with whom I have a lot in common, whereas I have almost nothing in common with those he proposes should be killed by any means possible. The strangeness also comes from the fact that the author not only recognizes but has actually been writing about the grounds on which I will put forward my criticism.

I am not sure whether Pervez Hoodbhoy is one or not, but I am an unapologetic atheist and have almost the worst possible opinion about religious fundamentalism of any kind, especially when it is of the organised kind or has organisational support. I also have no hesitation in stating that there IS something that can be called Islamic Fascism and it should be called by its proper name. But I also recognize that often things get mixed up and we can have a resistance movement that is also a Fascist movement. That makes it difficult to analyze them, let alone judge them. We can, however, still analyze and judge specific facts and events and be mostly right about them if we have sufficient evidence and we make sure that we keep our intellectual integrity intact.

Thus, for example, the people who are being targeted by the American drones (excluding those caught in the ‘collateral damage’) have been doing things which no sensible human being can support. These include the horrible terrorist acts, but more importantly (as the author rightly points out) they include their atrocities on their own people: women, protesters of any kind, ‘blasphemers’ etc. I can very well see what would happen to me if I were living in that kind of society.

I also share most of what the author has been saying. The trouble is that he also makes some leaps of logic, and reaches some conclusions, which seem patently wrong to me and I think I have to register my disagreement with them, because they are far too important to be ignored.

I could, perhaps, write a longer article about it, but for now I will try to say a few things which matter more to me.

The first problem is that the author mixes up the literal and the metaphorical and this logical error leads him to atrocious conclusions. We can surely talk about ‘human drones’ where we are using the word drone metaphorically and the usage is justified as he has eloquently explained by comparing them with the non-human aerial drones. But the comparison itself is metaphorical. And the justification does not remain valid when he goes on to establish a straightforward literal equivalence. The ‘human drones’ might be brutal, unthinking, destructive, (metaphorical) killing machines and so on. They might be, in a sense, inhuman or anti-human, but they still are not non-human. They do have bodies, minds and thoughts. To say otherwise is to abandon one’s thinking in a fit of rage. What they deserve or not may be a matter of debate, but it has to be based on a vision that does not ignore the fact that they still are human beings, however detestable and dangerous they may be.

I am sure the author is aware of some of the history of the world which seems to indicate that there were a lot of other people – and still are – who might also be justifiably called ‘human drones’ and who might be considered as bad as the ones he is talking about. That definitely can’t justify their actions, but it has a bearing on what those taking up the task of judging them should think and do.

If you agree with my contention here, then the analysis will lead to different directions. What those directions exactly should be, I won’t go into, because I don’t claim to have the answers, but they would lead to conclusions different from those of the author.

Even the metaphorical comparison here has some problems, which can, as I said, be guessed from what the author himself has been writing. There are some similarities, but there are also many differences. The ‘human drones’ still come from a certain society and they are part of it. The aerial drones are just machines, they don’t come from any society. The ‘human drones’ come from societies which have seen destruction of the worst kind for ages, whereas the aerial drones are (literally) remote controlled by those who played the primary role in bringing about this destruction, as the author himself has written and said elsewhere. If you ignore these facts, you will again be led to very risky (and I would say immoral and unfair) conclusions.

With just a little dilution of the metaphor, haven’t most of the weapon laden humans (soldiers, commandos etc.) been kinds of ‘human drones’? The ones the author talks about may be deadlier, but the situation is more drastic too. On the one hand you have an empire that is more powerful than any in history and on the other you have an almost primitive society that thinks it is defending itself, just as the empire says it is defending itself. Will it be improper to ask who has got more people killed? What about the ‘human drones’ of the empire: thinking of, say Iraq?

As far as I can see, the use of aerial drones to kill people, whoever they may be, is simply indefensible. Because if their use is justified on the grounds of the monstrosities of the Taliban ideologues and operators, what about chemicals? If some people were to form an anti-Taliban group and they were to infiltrate the ‘affected’ villages and towns and if they were to use poisoning of the water supply or something similar to kill people in the areas where these monsters are suspected to be, would that be justifiable? The aerial drones are, after all, just a technological device. There can be other such technologies and devices.

There must have been some very solid reasons why the whole world agreed to ban the use of chemical and biological weapons after the first world war and stuck to that ban (with a few universally condemned exceptions), though they were very effective and the Nazis were very evil.

The other big problem I have with the author’s opinions on this matter is that he suggests that the American aerial drones are one of the unsavoury weapons we should use against the fundamentalist Islamic militants. This is a logical error as well as a moral one. The logical error is that ‘we’ are not using the weapons at all, the empire is using them. And it is the same empire that created the problem in the first place, once again as the author himself has said. We have no control over how these drones are or will be used and who they will be used against in the future. Can’t they, some day in the future, be used against ‘us’? Why not? Perhaps the empire won’t use them directly, but it can always outsource their use: think again of Iraq. Iraq of the past and Iraq of the present. The author, in fact, knows very well the other examples that I could give.

To put it in Orwell’s words, make a habit of imprisoning Fascists without trial, and perhaps the process won’t stop at Fascists.

The use of aerial drones, they being just a technological device, might perhaps be justifiable for certain purposes, for example in managing relief work during large scale natural disasters, e.g. the wild fires in Russia or the frequent floods in India and China (but not as just a cover for their more sinister use). Their use for killing humans is, however, of a completely different nature.

The moral error is that the author’s conclusions unambiguously imply that ends justify the means. As long as these monsters producing (or becoming) ‘human drones’ are killed, it doesn’t matter whether the weapons are, to use the author’s word, unsavory. It also doesn’t matter that they are being used by an empire ‘we’ are opposed to and which started the mess. (Actually, the mess was started long ago by another empire, but then we could say there were even older empires who played a role in creating this mess, so let’s not go into that).

I even sort of agree with the author’s idea that recommending the standard left meta-technique of “mobilizing” people (actually, it is not just leftists who use such techniques) may not be very practical under the conditions prevalent in this case. But, as I said, though I understand the severity of the problems, I don’t have the solutions. I only want to say that the kind of errors that the author makes can lead us to a worse situation. We should not forget (I am sure the author knows this too) that it is not just a case of some bad apples. Even if these were to be removed by using ‘unsavory’ forces and weapons, the problems are not going to be solved so easily. Because there is not just one clearcut problem but many problems which are all meshed together and the meshing is too complex and barely visible.

At the risk of making an unpalatable statement, I would say that if any party in conflicts like this has to be excused for using unsavory weapons or tactics, it will have to be the much weaker party, not the strongest party in history. But I don’t think I would include suicide bombing among those weapons or tactics. And I also realize the limits to which I can be entitled to sit in judgement over people living under such conditions.

The author need not offer me (business class or mere economy) tickets to Waziristan. I am scared to even go to places in India.

One more problem that I have with the author’s writings is that he seems to have assigned blame to most parties involved in the conflict: the Army, the militants, the Taliban, the Americans etc., but has he (I haven’t read everything written by him) considered, equally critically, the role of the Pakistani elite (not just the leftists) and the somewhat ‘secular’ middle class? He seems to have hinted at their role, but it seems to me that their role was, is and will be far more critical in determining what is happening and what will happen. After all, the rise (if we can call it that) of the Taliban closely parallels the Islamisation of the Pakistani society in general. How did the Pakistani elite (intellectual, feudal and official) help in this and what can they do to solve this problem?

That, it seems to me, is the crucial question to ask (though it won’t lead to a quick fix), apart from what people around the world can do about those controlling the aerial drones, at whom, as the author earlier wrote, “we still dare not point a finger”. After going on to point a finger at them, the author seems to have now moved to the position of accepting their support in terms of killings by the aerial drones in order to contain the ‘human drones’, which (to be a bit harsh) doesn’t make sense to me.

Related to this is another question: does the natural antipathy of the Pakistani elite towards these ‘primitive’ tribal communities have something to do with the position that the author has taken and which he says many others (‘educated people’) share?

There are, of course, other actors. The author has mentioned Saudi Arabia, but Iran has a role. Even India has (or at least wants to have) a role.

But I want to end on a positive note. It’s heartening to see that ZNet allows this kind of dissenting view to be presented on its platform. That should be a good sign for the discussion.

[Unfortunately, I have to end on a slightly negative note. As I was going to add the comment to the article, I realized that I have to be a ‘sustainer’ even to post a comment. And I have not been able to become a sustainer for reasons I won’t go into here. Hence I post it here.]

Computational Linguistics

As I wrote in the previous entry (can the Hindi word ‘प्रविष्टी’ be used for ‘post’?), over the next few weeks I am going to write about Sanchay.

But since Sanchay has been built especially (apart from ordinary users) for researchers in computational linguistics or linguistics, it would be good to make clear what computational linguistics and linguistics mean, or, if you already know what they mean, what I mean by them. I add the latter because not only do ordinary people have all kinds of misconceptions about what these subjects (computational linguistics or linguistics) mean, but even the researchers in these subjects do not agree on how to define them.

The truth is that in the Hindi-speaking world most people still take linguistics to mean the kind of study it was taken to mean at the beginning of the last century. But I would rather not go in that direction right now, because there is so much to say about it that the present purpose would be left behind.

For that matter, there is a great deal to say about the definitions of computational linguistics and linguistics, and about their boundaries, but for now a little will have to do.

So, put briefly, linguistics is the field of research or study in which one does not merely study the grammar of some one language, but studies natural or human (that is, not artificial) language scientifically. It is now a widely accepted notion that the structure of the human brain is directly related to the structure of language, and since the brains of all humans have basically the same structure, all natural or human languages are also the same in everything except their surface features. That is why, as is well known in the modern literature of these fields, if the infant of an American is adopted right after birth by a Chinese family and the child grows up in China, it will learn to speak Chinese as easily as any child of a Chinese family. There are many more such things to say, but the main point is that linguistics is the scientific study of natural or human language.

At least the attempt is to keep the study scientific; whether it actually stays that way is a matter of debate.

Coming now to computational linguistics, in this subject our attention shifts from humans to the computer, but the earlier condition still applies: the scientific study of natural or human language. The difference is that our aim now becomes to make the computer capable of understanding natural or human language and of using it. Obviously this is still a long way off, and that should be no surprise, because in linguistics itself (even after the extraordinary achievements of the last century) scientists are stuck on a great many obstacles.

Even so, a good deal has already become possible in computational linguistics, and a good deal more may become possible ahead (in the near future). But this does not include computers speaking and understanding language the way humans do. What it does include are technologies that can search documents better, summarize them, translate them to some extent, and so on.

But in the Indian context the trouble is that we have not even reached the stage where we can easily use the computer as just a better typewriter. There have been some achievements in this direction, but compared to English or the major European languages we are nowhere. As most of you already know, this is a long story that is best left alone for now.

But Sanchay has been developed in this very context, which we will talk about later.

An Introduction to Sanchay

In the previous post (I have to say with some shame that I cannot find a suitable Hindi word for ‘post’) I wrote (in English) about the new version of Sanchay. The funny thing is that I have hardly written anything about Sanchay in Hindi so far. In an attempt to correct this lapse, I have decided to write something about Sanchay over the next few weeks.

So who is Sanchay? Or what is Sanchay?

The answer to the first question (in American terminology) is that Sanchay is a single parent child who is not getting the benefit of any welfare but who has a lot of responsibilities.

The answer to the second question is that Sanchay is a free (as in freedom, and also free of cost) and open source collection of computational tools useful for researchers working in computational linguistics or linguistics. But in particular it can be of use to anyone who uses Indian languages on a computer. One of its features is that new languages and encodings can easily be added to it. Almost all the major Indian languages are already included, and you do not depend on the operating system to use them in Sanchay, although if the operating system does support any such language, you can use that facility in Sanchay too. Not only that, the same version of Sanchay works on both Windows and Linux/Unix, provided you have the JDK (Java Development Kit) installed. Even the font for your language need not be installed in the operating system.

The current version of Sanchay is 0.3.0. The biggest difference between this version and the previous one is that all of Sanchay’s tools can now be used from one place; there is no need to remember the names of separate scripts. In all, twelve tools (applications) are included, which are:

  1. Sanchay Text Editor
  2. Table Editor
  3. Find-Replace-Extract Tool
  4. Word List Builder
  5. Word List Analyzer and Visualizer
  6. Language and Encoding Identification Tool
  7. Syntactic Annotation Interface
  8. Parallel Corpus Annotation Interface
  9. N-gram Language Modeling Tool
  10. Discourse Annotation Interface
  11. File Splitter
  12. Automatic Annotation Tool

If you cannot make head or tail of most of these, wait a little. I will try to give more information about them later.

Perhaps there is no harm in adding that Sanchay is the result of this humble writer’s stubborn resolve over the last few years, to which some other people have also contributed, even if only a little each. The names of all those people will soon be available on the Sanchay website. Almost all of them are (or were) students who worked, or are working, on some project under my ‘guidance’.

Hopefully the next version of Sanchay will be out in a few months, with even more tools and features.

Good News and Bad News on the CL Front

First, as the saying goes, the bad news. We had submitted a proposal for the Second Workshop on NLP for Less Privileged Languages for the ACL-affiliated conferences. That proposal has not been accepted. In total, 41 proposals were submitted and 34 of them were accepted. Ours was among the not-accepted seven (euphemisms can be consoling).

Was it that bad? I hope not.

Don’t those capital letters look silly in the name of a rejected proposal?

Now the good news. The long awaited new version of Sanchay has been released on Sourceforge. (Well, at least I was awaiting it.) This version has been named (or numbered?) 0.3.0.

The new Sanchay is a significant improvement over the last public version (0.2). It now has one main GUI from which all the applications can be controlled. There are twelve (GUI based) applications which have been included in this version. These are:

  • Sanchay Text Editor that is connected to some other NLP/CL components of Sanchay.
  • Table Editor with all the usual facilities.
  • A more intelligent Find-Replace-Extract Tool (can search over annotated data and allows you to see the matching files in the annotation interface).
  • Word List Builder.
  • Word List FST (Finite State Transducer) Visualizer that can be useful for anyone working with morphological analysis etc.
  • One of the most accurate Language and Encoding Identifier that is currently trained for 54 language-encoding pairs, including most of the major Indian languages. (Yes, I know there is a number agreement problem in the previous sentence).
  • A user friendly Syntactic Annotation Interface that is perhaps the most heavily used part of Sanchay till now. Hopefully there will be an even more user friendly version soon.
  • A Parallel Corpus Annotation Interface, which is another heavily used component. (Don’t take that ‘heavily’ too seriously).
  • An N-gram Language Modeling Tool that allows you to compile models in terms of bytes, letters and words.
  • A Discourse Annotation Interface that is yet to be actually used.
  • A more intelligent File Splitter.
  • An Automatic Annotation tool for POS (Part Of Speech) tagging, chunking and Named Entity Recognition. The first two should work reasonably well, but the last one may not be that useful for practical purposes. This is a CRF (Conditional Random Fields) based tool and it has been trained for Hindi for these three purposes. If you have annotated data, you can use it to train your own taggers and chunkers.

All these components use the customizable language-encoding support, especially useful for South Asian languages, that doesn’t need any support from the operating system or even the installation of any fonts, although these can still be used inside Sanchay if they are there.
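To make the “bytes, letters and words” distinction in the N-gram tool concrete, here is a minimal Python sketch. It is not Sanchay’s code (Sanchay itself runs on Java), and it only counts n-grams rather than compiling a full language model. The names `ngrams` and `count_ngrams` are made up for illustration, and ‘letter’ is assumed here to mean a single Unicode code point.

```python
from collections import Counter


def ngrams(units, n):
    """Return every contiguous run of n units as a tuple."""
    return [tuple(units[i:i + n]) for i in range(len(units) - n + 1)]


def count_ngrams(text, n, unit="word", encoding="utf-8"):
    """Count n-grams of `text` over bytes, letters, or words.

    Hypothetical helper for illustration only, not part of Sanchay.
    """
    if unit == "byte":
        # Bytes depend on the encoding: in UTF-8 a Devanagari
        # code point takes three bytes.
        units = list(text.encode(encoding))
    elif unit == "letter":
        # Assumption: one "letter" is one Unicode code point, so a
        # vowel sign (matra) counts separately from its consonant.
        units = list(text)
    elif unit == "word":
        # Assumption: words are whitespace-separated tokens.
        units = text.split()
    else:
        raise ValueError(f"unknown unit: {unit}")
    return Counter(ngrams(units, n))


if __name__ == "__main__":
    sample = "संचय एक संकलन है"
    for unit in ("byte", "letter", "word"):
        counts = count_ngrams(sample, 2, unit)
        print(unit, "bigrams:", sum(counts.values()))
```

The same text gives very different models at the three levels: byte n-grams see the encoding, letter n-grams see the script, and word n-grams see the vocabulary.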

More information is available at the Sanchay Home.

The capitals don’t look so bad for a released version.

The downside of even this good news is that my other urgent (to me) work has got delayed as I was working almost exclusively on bringing out this version for the last two weeks or so.

But then you need a reason to wake up and Sanchay is one of my reasons. And I can proudly say that a half-hearted attempt to generate funding for this project by posting it on Micropledge has generated $0.

Sanchay is still alive as a single parent child without any welfare but with a lot of responsibilities.

Now I can have nightmares about the bugs.