Asmai: (Al'asma'i) Arabic semantic analysis
===========================================

Al-Asma'i Semantic Library
==========================

Asmai: (Al'asma'i) Arabic semantic analysis library for Python

.. figure:: doc/asmai_header.png
   :alt: asmai logo

   asmai logo

.. figure:: https://img.shields.io/pypi/dm/asmai
   :alt: PyPI - Downloads

   PyPI - Downloads

Developer: Taha Zerrouki (http://tahadz.com), taha dot zerrouki at gmail
dot com

+-------------+--------------------------------------------------------------------------------------------+
| Features    | value                                                                                      |
+=============+============================================================================================+
| Authors     | `Authors.md <https://github.com/linuxscout/asmai-arabic-semantic/master/AUTHORS.md>`__     |
+-------------+--------------------------------------------------------------------------------------------+
| Release     | 0.1                                                                                        |
+-------------+--------------------------------------------------------------------------------------------+
| License     | `GPL <https://github.com/linuxscout/asmai-arabic-semantic/master/LICENSE>`__               |
+-------------+--------------------------------------------------------------------------------------------+
| Tracker     | `linuxscout/asmai/Issues <https://github.com/linuxscout/asmai-arabic-semantic/issues>`__   |
+-------------+--------------------------------------------------------------------------------------------+
| Source      | `Github <http://github.com/linuxscout/asmai-arabic-semantic>`__                            |
+-------------+--------------------------------------------------------------------------------------------+
| Feedback    | `Comments <https://github.com/linuxscout/asmai-arabic-semantic/>`__                        |
+-------------+--------------------------------------------------------------------------------------------+
| Accounts    | `@Twitter <https://twitter.com/linuxscout>`__                                              |
+-------------+--------------------------------------------------------------------------------------------+

Description
-----------

Asmai: (Al'asma'i) Arabic semantic analysis library for Python

Features:
~~~~~~~~~

-  Extracts word pairs that carry a semantic relation of one of these
   types: subject (فاعلية), object (مفعولية), or genitive construct
   (إضافة).


Install
~~~~~~~

.. code:: shell

    pip install asmai

Usage
~~~~~

Import
^^^^^^

.. code:: python

    import asmai.anasem as asm

Test
^^^^

.. code:: python

    import asmai.anasem as asm

    text = "يعبد الله منذ أن تطلع الشمس"
    anasem = asm.SemanticAnalyzer()
    result = anasem.analyze_text(text)
    # the result contains objects
    anasem.pprint(result)

-  Extract semantic relations and display only the relations found

.. code:: python

    >>> import pprint
    >>> sem_result = anasem.display_sem(result)
    >>> pprint.pprint(sem_result)      
    [[['الشَّمْسُ', 'تَطْلُعَ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
      ['الشَّمْسُ', 'تَطْلُعُ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
      ['الشَّمْسُ', 'تَطْلُعْ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
      ['الشَّمْسُ', 'تَطْلَعَ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
      ['الشَّمْسُ', 'تَطْلَعُ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
      ['الشَّمْسُ', 'تَطْلَعْ', 'شَمْسٌ', 'طَلَعَ', 'Subject']]]

-  Extract semantic relations and display all words with their tags

   .. code:: python

       >>> sem_result = anasem.display_sem(result, all=True)
       >>> pprint.pprint(sem_result)
       [('يعبد', 'O', []),
        ('الله', 'O', []),
        ('منذ', 'O', []),
        ('أن', 'O', []),
        ('تطلع', 'B', []),
        ('الشمس',
         'I',
         [['الشَّمْسُ', 'تَطْلُعَ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
          ['الشَّمْسُ', 'تَطْلُعُ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
          ['الشَّمْسُ', 'تَطْلُعْ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
          ['الشَّمْسُ', 'تَطْلَعَ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
          ['الشَّمْسُ', 'تَطْلَعُ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
          ['الشَّمْسُ', 'تَطْلَعْ', 'شَمْسٌ', 'طَلَعَ', 'Subject']])]

-  Convert to pandas

   .. code:: python

       >>> import pandas as pd
       >>> # flatten the result
       >>> df = pd.DataFrame(anasem.decode(result))
       >>> print(df.head())
         action affix affix_key forced_word_case ... unvocalized unvoriginal vocalized word
       0  -ي--  -ي--|المضارع المنصوب:هو:y  False  ...  يعبد  عبد  يُعَبِّدَ  يعبد
       1  -ي--  -ي--|المضارع المجهول المجزوم:هو:y  False  ...  يعبد  عبد  يُعَبَّدْ  يعبد
       2  -ي--  -ي--|المضارع المجهول:هو:y  False  ...  يعبد  عبد  يُعَبَّدُ  يعبد
       3  -ي--  -ي--|المضارع المعلوم:هو:y  False  ...  يعبد  عبد  يُعَبِّدُ  يعبد
       4  -ي--  -ي--|المضارع المجزوم:هو:y  False  ...  يعبد  عبد  يُعَبِّدْ  يعبد

       [5 rows x 50 columns]
       >>> df.to_csv("output/test.csv", encoding="utf8", sep="\t")
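
In the ``display_sem`` sample output above, the same relation appears once
per vocalization variant. The sketch below collapses such rows into unique
triples. It assumes the five-field row layout seen in that output, with the
two lemmas in positions 2 and 3 and the relation label last. The helper name
``unique_relations`` is hypothetical and not part of asmai.

.. code:: python

    def unique_relations(rows):
        """Collapse rows that differ only in vocalization.

        Assumed row layout (inferred from the sample output, not from
        the asmai API): [word_a, word_b, lemma_a, lemma_b, relation]
        """
        seen = set()
        unique = []
        for row in rows:
            key = tuple(row[2:5])  # (lemma_a, lemma_b, relation)
            if key not in seen:
                seen.add(key)
                unique.append(key)
        return unique


    # Two rows copied from the sample output; they differ only in the
    # vocalization of the second field.
    sample = [
        ['الشَّمْسُ', 'تَطْلُعَ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
        ['الشَّمْسُ', 'تَطْلُعُ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
    ]
    print(unique_relations(sample))

Because the sample result is a list of groups, the helper would be applied
to each inner group, for example
``[unique_relations(group) for group in sem_result]``.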



Requirements
^^^^^^^^^^^^

1. pyarabic
2. sqlite
3. sylajone

Data Structure:
---------------

Semantic database
~~~~~~~~~~~~~~~~~

.. code:: sql

    CREATE TABLE sqlite_sequence(name,seq);
    CREATE TABLE "derivations" (
        "id" INTEGER PRIMARY KEY  AUTOINCREMENT  NOT NULL  UNIQUE ,
        "verb" varchar NOT NULL ,
        "transitive" BOOL NOT NULL  DEFAULT 1,
        "derived" VARCHAR NOT NULL ,
        "type" VARCHAR NOT NULL 
     );

CSV Structure:

-  Derivation

1. id: unique id in the database
2. verb: vocalized collocation
3. transitive: whether the verb is transitive
4. derived: word derived from the verb
5. type: type
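
As a rough illustration of how the ``derivations`` table could be queried,
the sketch below recreates the schema in an in-memory SQLite database using
Python's standard ``sqlite3`` module. The inserted rows are placeholders
invented for the example, and the helper ``derived_forms`` is hypothetical.
Neither comes from the asmai data or API.

.. code:: python

    import sqlite3

    # In-memory database, so no real asmai data file is touched.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE "derivations" (
            "id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL UNIQUE,
            "verb" VARCHAR NOT NULL,
            "transitive" BOOL NOT NULL DEFAULT 1,
            "derived" VARCHAR NOT NULL,
            "type" VARCHAR NOT NULL
        )
        """
    )

    # Placeholder rows: values are invented for illustration only.
    rows = [
        ("verb_a", 1, "derived_a1", "type_x"),
        ("verb_a", 1, "derived_a2", "type_y"),
        ("verb_b", 0, "derived_b1", "type_x"),
    ]
    conn.executemany(
        'INSERT INTO derivations (verb, transitive, derived, "type") '
        "VALUES (?, ?, ?, ?)",
        rows,
    )


    def derived_forms(connection, verb):
        """Return (derived, type) pairs stored for one verb, ordered by id."""
        cursor = connection.execute(
            'SELECT derived, "type" FROM derivations '
            "WHERE verb = ? ORDER BY id",
            (verb,),
        )
        return cursor.fetchall()


    print(derived_forms(conn, "verb_a"))

Parameter binding with ``?`` keeps Arabic strings intact and avoids manual
quoting of the verb.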

Semantic relations
^^^^^^^^^^^^^^^^^^

.. code:: sql

    CREATE TABLE "relations" (
        "id" INTEGER PRIMARY KEY  NOT NULL ,
        "first" VARCHAR NOT NULL  DEFAULT ('') ,
        "second" VARCHAR NOT NULL  DEFAULT ('') ,
        "rule" VARCHAR NOT NULL  DEFAULT (0) 
     );

CSV Structure:

1. id: unique id in the database
2. first: first word
3. second: second word
4. rule: the extraction rule number
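
A lookup against the ``relations`` table might look like the following
sketch, which rebuilds the schema in an in-memory SQLite database with the
standard ``sqlite3`` module. The word pairs and rule values are placeholders,
and ``find_rules`` is a hypothetical helper, not an asmai function.

.. code:: python

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE "relations" (
            "id" INTEGER PRIMARY KEY NOT NULL,
            "first" VARCHAR NOT NULL DEFAULT (''),
            "second" VARCHAR NOT NULL DEFAULT (''),
            "rule" VARCHAR NOT NULL DEFAULT (0)
        )
        """
    )

    # Placeholder pairs and rule numbers, invented for illustration only.
    conn.executemany(
        'INSERT INTO relations ("first", "second", "rule") VALUES (?, ?, ?)',
        [("word_a", "word_b", "1"), ("word_a", "word_c", "2")],
    )


    def find_rules(connection, first, second):
        """Return the rule values stored for a (first, second) word pair."""
        cursor = connection.execute(
            'SELECT "rule" FROM relations '
            'WHERE "first" = ? AND "second" = ? ORDER BY id',
            (first, second),
        )
        return [row[0] for row in cursor]


    print(find_rules(conn, "word_a", "word_b"))

An empty list signals that no relation is stored for the pair.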


