Date: Tue, 12 Oct 2004 06:05:20 +0900 (JST)
From: GOTOU Yuuzou <gotoyuzo@notwork.org>
Subject: [webrickja:121] Re: Accept-Language: support
To: webrickja@notwork.org
Message-Id: <20041012.060520.28780163.gotoyuzo@sawara.does.notwork.org>
In-Reply-To: <m3vfdmq4kc.wl@namazu.org>
References: <m33c0swbj8.wl@namazu.org>	<20041007.171137.607952698.gotoyuzo@sawara.does.notwork.org>	<m3vfdmq4kc.wl@namazu.org>
X-Mail-Count: 00121

In message <m3vfdmq4kc.wl@namazu.org>,
 `Satoru Takabayashi <satoru@namazu.org>' wrote:
> GOTOU Yuuzou:
> > When both #{DocumentRoot}/foo.html and #{DocumentRoot}/foo.html.ja
> > exist, I feel a request for /foo.html could reasonably prefer
> > #{DocumentRoot}/foo.html.ja, but apparently that is not how it
> > is supposed to work.
> 
> Right. Apache gives foo.html priority, so I copied that behavior.

Perhaps they disliked having to look for foo.html.ja.{ja,en,..}
when foo.html.ja itself is requested. Copying Apache does seem like
the safe choice, though.

> > > If possible, it would help if the FileHandler in WEBrick itself
> > > supported Accept-Language:. It would be convenient if passing
> > > :AcceptLanguage => true as a FileHandler option made it look at
> > > Accept-Language.
> > 
> > Are there cases where having it always enabled would cause trouble?
> 
> Some people might consider it a waste to check for each of
> foo.html.{en,ja,...} in turn when foo.html does not exist.

I see.
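
To make that cost concrete, here is a rough sketch of which names
would be probed for a single path element. It is not the code in the
patch: probe_order is a hypothetical helper, and parse_qvalues here is
a simplified stand-in written only for this illustration.

```ruby
# Simplified stand-in for an Accept-* header parser: returns the listed
# values ordered by descending q-value (a missing q counts as 1.0).
def parse_qvalues(value)
  return [] unless value
  pairs = value.split(/,\s*/).map { |part|
    m = /\A([^\s,;]+)(?:;\s*q=(\d+(?:\.\d+)?))?\z/.match(part)
    m && [m[1], (m[2] || 1).to_f]
  }.compact
  # sort_by is not guaranteed to be stable, so ties keep header order
  # by using the original position as a secondary key.
  pairs.each_with_index.sort_by { |(_lang, q), i| [-q, i] }
       .map { |(lang, _q), _i| lang }
end

# Hypothetical helper: the names that would be checked, in order, for
# one path element. `acceptable` plays the role of the server-side
# language list; an empty list means only the exact name is checked.
def probe_order(basename, accept_language, acceptable)
  preferred = parse_qvalues(accept_language) & acceptable
  langs = preferred + (acceptable - preferred)
  [basename] + langs.map { |lang| "#{basename}.#{lang}" }
end

p probe_order("/foo.html", "ja, en;q=0.7", ["en", "ja"])
# => ["/foo.html", "/foo.html.ja", "/foo.html.en"]
```

In this sketch the worst case is one check for the exact name plus one
per configured language, and an empty language list keeps the lookup
at a single check.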

> Supporting Accept-Language without modifying WEBrick itself does
> seem to be a bit of a stretch after all.

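Agreed. After thinking about it a little, I factored the part that
searches for the file out into a method:

  WEBrick::HTTPServlet::FileHandler#search_file(req, res, basename)

basename receives the path element currently being examined. If this
method returns a non-nil string, the part of script_name that
corresponds to basename is replaced with that value.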
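
As an illustration of that contract only, the following stand-alone
sketch imitates the hook with plain arguments instead of WEBrick's req
and res objects. The argument list (dir, accept_language, acceptable)
is an assumption made for this example, not the signature in the patch.

```ruby
require 'tmpdir'
require 'fileutils'

# Stand-alone imitation of the hook: returns the name to use in place of
# basename, or nil when nothing suitable exists under dir.
def search_file(dir, basename, accept_language, acceptable)
  path = dir + basename
  # An exact match always wins, mirroring the Apache behavior.
  return basename if File.file?(path)
  # Otherwise try the client's languages first, then the remaining ones.
  preferred = accept_language & acceptable
  (preferred + (acceptable - preferred)).each do |lang|
    return "#{basename}.#{lang}" if File.file?("#{path}.#{lang}")
  end
  nil
end

Dir.mktmpdir do |root|
  FileUtils.touch(File.join(root, "foo.html"))
  FileUtils.touch(File.join(root, "foo.html.ja"))
  FileUtils.touch(File.join(root, "bar.html.en"))

  p search_file(root, "/foo.html", ["ja"], ["en", "ja"])  # => "/foo.html"
  p search_file(root, "/bar.html", ["ja"], ["en", "ja"])  # => "/bar.html.en"
  p search_file(root, "/baz.html", ["ja"], ["en", "ja"])  # => nil
end
```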

For example, when the Request-URI is /foo/bar.cgi/baz and a directory
corresponding to /foo exists, the method is called as

  search_file(req, res, "/bar.cgi")

If search_file then returns the name of a file that actually exists
(for example "/bar.cgi.en"), the result is

  res.filename    = #{DocumentRoot}/foo/bar.cgi.en
  req.script_name = /foo/bar.cgi.en
  req.path_info   = /baz
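
The walk in this example can be reproduced with a small stand-alone
sketch. resolve and its block-based lookup are hypothetical stand-ins
for set_filename and search_file, and index-file handling is left out
to keep it short.

```ruby
require 'tmpdir'
require 'fileutils'

# Simplified walk over the path elements of a request. The block receives
# the directory reached so far and the next element, and returns the name
# to substitute for that element (or nil when nothing was found).
def resolve(root, request_path)
  filename = root.dup
  script_name = String.new
  elements = request_path.scan(%r{/[^/]*})
  # Consume leading elements for as long as they name directories.
  while (base = elements.first) && base != "/" &&
        File.directory?(filename + base)
    elements.shift
    filename << base
    script_name << base
  end
  base = elements.shift
  found = base && yield(filename, base)
  return nil unless found
  { filename: filename + found,
    script_name: script_name + found,
    path_info: elements.join }
end

Dir.mktmpdir do |root|
  FileUtils.mkdir(File.join(root, "foo"))
  FileUtils.touch(File.join(root, "foo", "bar.cgi.en"))
  result = resolve(root, "/foo/bar.cgi/baz") do |dir, base|
    [base, base + ".en"].find { |name| File.file?(dir + name) }
  end
  p result[:script_name]  # => "/foo/bar.cgi.en"
  p result[:path_info]    # => "/baz"
end
```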

While I was at it, I would also like a mechanism that generates the
charset parameter of Content-Type:, but I could not come up with a
good implementation.

-- 
GOTOU Yuuzou

Index: lib/webrick/config.rb
===================================================================
RCS file: /var/cvs/src/ruby/lib/webrick/config.rb,v
retrieving revision 1.4
diff -u -p -F^[^A-Za-z0-9_+-]*\(class\|module\|def\)[^A-Za-z0-9_+-] -r1.4 config.rb
--- lib/webrick/config.rb	11 Mar 2004 22:36:11 -0000	1.4
+++ lib/webrick/config.rb	11 Oct 2004 21:02:01 -0000
@@ -72,6 +72,7 @@   module Config
       :DirectoryCallback => nil,
       :FileCallback      => nil,
       :UserDir           => "public_html",
+      :AcceptableLanguages => []  # ["en", "ja", ... ]
     }
 
     BasicAuth = {
Index: lib/webrick/httprequest.rb
===================================================================
RCS file: /var/cvs/src/ruby/lib/webrick/httprequest.rb,v
retrieving revision 1.5
diff -u -p -F^[^A-Za-z0-9_+-]*\(class\|module\|def\)[^A-Za-z0-9_+-] -r1.5 httprequest.rb
--- lib/webrick/httprequest.rb	20 Dec 2003 13:01:33 -0000	1.5
+++ lib/webrick/httprequest.rb	11 Oct 2004 21:02:01 -0000
@@ -32,6 +32,8 @@   class HTTPRequest
 
     # Header and entity body
     attr_reader :raw_header, :header, :cookies
+    attr_reader :accept, :accept_charset
+    attr_reader :accept_encoding, :accept_language
 
     # Misc
     attr_accessor :user
@@ -56,6 +58,8 @@     def initialize(config)
       @raw_header = Array.new
       @header = nil
       @cookies = []
+      @accept = @accept_charset =
+        @accept_encoding = @accept_language = nil
       @body = ""
 
       @addr = @peeraddr = nil
@@ -83,6 +87,10 @@     def parse(socket=nil)
         @header['cookie'].each{|cookie|
           @cookies += Cookie::parse(cookie)
         }
+        @accept = HTTPUtils.parse_qvalues(self['accept'])
+        @accept_charset = HTTPUtils.parse_qvalues(self['accept-charset'])
+        @accept_encoding = HTTPUtils.parse_qvalues(self['accept-encoding'])
+        @accept_language = HTTPUtils.parse_qvalues(self['accept-language'])
       end
       return if @request_method == "CONNECT"
       return if @unparsed_uri == "*"
@@ -122,6 +130,14 @@     def query
         parse_query()
       end
       @query
+    end
+
+    def content_length
+      return Integer(self['content-length'])
+    end
+
+    def content_type
+      return self['content-type']
     end
 
     def [](header_name)
Index: lib/webrick/httpresponse.rb
===================================================================
RCS file: /var/cvs/src/ruby/lib/webrick/httpresponse.rb,v
retrieving revision 1.4
diff -u -p -F^[^A-Za-z0-9_+-]*\(class\|module\|def\)[^A-Za-z0-9_+-] -r1.4 httpresponse.rb
--- lib/webrick/httpresponse.rb	22 Dec 2003 21:13:06 -0000	1.4
+++ lib/webrick/httpresponse.rb	11 Oct 2004 21:02:01 -0000
@@ -63,6 +63,24 @@     def []=(field, value)
       @header[field.downcase] = value.to_s
     end
 
+    def content_length
+      if len = @header['content-length']
+        return Integer(len)
+      end
+    end
+
+    def content_length=(len)
+      @header['content-length'] = len.to_s
+    end
+
+    def content_type
+      @header['content-type']
+    end
+
+    def content_type=(type)
+      @header['content-type'] = type
+    end
+
     def each
       @header.each{|k, v|  yield(k, v) }
     end
@@ -250,7 +268,7 @@     def send_body_io(socket)
         _write_data(socket, "0#{CRLF}#{CRLF}")
       else
         size = @header['content-length'].to_i
-        _send_file(socket, @body, 0, size.to_i)
+        _send_file(socket, @body, 0, size)
         @sent_size = size
       end
       @body.close
Index: lib/webrick/httputils.rb
===================================================================
RCS file: /var/cvs/src/ruby/lib/webrick/httputils.rb,v
retrieving revision 1.6
diff -u -p -F^[^A-Za-z0-9_+-]*\(class\|module\|def\)[^A-Za-z0-9_+-] -r1.6 httputils.rb
--- lib/webrick/httputils.rb	13 Aug 2004 04:11:30 -0000	1.6
+++ lib/webrick/httputils.rb	11 Oct 2004 21:02:01 -0000
@@ -115,10 +115,9 @@     def load_mime_types(file)
     module_function :load_mime_types
 
     def mime_type(filename, mime_tab)
-      if suffix = (/\.(\w+)$/ =~ filename && $1)
-        mtype = mime_tab[suffix.downcase]
-      end
-      mtype || "application/octet-stream"
+      suffix1 = (/\.(\w+)$/ =~ filename && $1.downcase)
+      suffix2 = (/\.(\w+)\.[\w\-]+$/ =~ filename && $1.downcase)
+      mime_tab[suffix1] || mime_tab[suffix2] || "application/octet-stream"
     end
     module_function :mime_type
 
@@ -174,6 +173,24 @@     def parse_range_header(ranges_specif
       end
     end
     module_function :parse_range_header
+
+    def parse_qvalues(value)
+      tmp = []
+      if value
+        parts = value.split(/,\s*/)
+        parts.each {|part|
+          if m = %r{^([^\s,]+?)(?:;\s*q=(\d+(?:\.\d+)?))?$}.match(part)
+            lang = m[1]
+            q = (m[2] or 1).to_f
+            tmp.push([lang, q])
+          end
+        }
+        tmp = tmp.sort_by{|lang, q| -q}
+        tmp.collect!{|lang, q| lang}
+      end
+      return tmp
+    end
+    module_function :parse_qvalues
 
     #####
 
Index: lib/webrick/httpservlet/filehandler.rb
===================================================================
RCS file: /var/cvs/src/ruby/lib/webrick/httpservlet/filehandler.rb,v
retrieving revision 1.4
diff -u -p -F^[^A-Za-z0-9_+-]*\(class\|module\|def\)[^A-Za-z0-9_+-] -r1.4 filehandler.rb
--- lib/webrick/httpservlet/filehandler.rb	16 Sep 2004 09:14:09 -0000	1.4
+++ lib/webrick/httpservlet/filehandler.rb	11 Oct 2004 21:02:01 -0000
@@ -126,7 +126,7 @@       def prepare_range(range, filesize)
     end
 
     class FileHandler < AbstractServlet
-      HandlerTable = Hash.new(DefaultFileHandler)
+      HandlerTable = Hash.new
 
       def self.add_handler(suffix, handler)
         HandlerTable[suffix] = handler
@@ -201,8 +201,7 @@       def do_OPTIONS(req, res)
       def exec_handler(req, res)
         raise HTTPStatus::NotFound, "`#{req.path}' not found" unless @root
         if set_filename(req, res)
-          suffix = (/\.(\w+)$/ =~ res.filename) && $1
-          handler = @options[:HandlerTable][suffix] || HandlerTable[suffix]
+          handler = get_handler(req)
           call_callback(:HandlerCallback, req, res)
           h = handler.get_instance(@config, res.filename)
           h.service(req, res)
@@ -212,39 +211,93 @@       def exec_handler(req, res)
         return false
       end
 
+      def get_handler(req)
+        suffix1 = (/\.(\w+)$/ =~ req.script_name) && $1.downcase
+        suffix2 = (/\.(\w+)\.[\w\-]+$/ =~ req.script_name) && $1.downcase
+        handler_table = @options[:HandlerTable]
+        return handler_table[suffix1] || handler_table[suffix2] ||
+               HandlerTable[suffix1] || HandlerTable[suffix2] ||
+               DefaultFileHandler
+      end
+
       def set_filename(req, res)
-        handler = nil
         res.filename = @root.dup
         path_info = req.path_info.scan(%r|/[^/]*|)
 
-        while name = path_info.shift
-          if name == "/"
-            indices = @config[:DirectoryIndex]
-            index = indices.find{|i| FileTest::file?("#{res.filename}/#{i}") }
-            name = "/#{index}" if index
-          end
-          res.filename << name
-          req.script_name << name
-          req.path_info = path_info.join
-
-          if File::fnmatch("/#{@options[:NondisclosureName]}", name)
-            @logger.log(Log::WARN,
-               "the request refers nondisclosure name `#{name}'.")
-            raise HTTPStatus::Forbidden, "`#{req.path}' not found."
-          end
-          st = (File::stat(res.filename) rescue nil)
-          raise HTTPStatus::NotFound, "`#{req.path}' not found." unless st
-          raise HTTPStatus::Forbidden,
-            "no access permission to `#{req.path}'." unless st.readable?
-
-          if st.directory?
-            call_callback(:DirectoryCallback, req, res)
-          else
+        path_info.unshift("")  # dummy for checking @root dir
+        while base = path_info.first
+          check_filename(base)
+          break if base == "/"
+          break unless File.directory?(res.filename + base)
+          shift_path_info(req, res, path_info)
+          call_callback(:DirectoryCallback, req, res)
+        end
+
+        if base = path_info.first
+          check_filename(base)
+          if base == "/"
+            if file = search_index_file(req, res)
+              shift_path_info(req, res, path_info, file)
+              call_callback(:FileCallback, req, res)
+              return true
+            end
+            shift_path_info(req, res, path_info)
+          elsif file = search_file(req, res, base)
+            shift_path_info(req, res, path_info, file)
             call_callback(:FileCallback, req, res)
             return true
+          else
+            raise HTTPStatus::NotFound, "`#{req.path}' not found."
           end
         end
+
         return false
+      end
+
+      def check_filename(name)
+        if File.fnmatch("/#{@options[:NondisclosureName]}", name)
+          @logger.warn("the request refers nondisclosure name `#{name}'.")
+          raise HTTPStatus::NotFound, "`#{name}' not found."
+        end
+      end
+
+      def shift_path_info(req, res, path_info, base=nil)
+        tmp = path_info.shift
+        base = base || tmp
+        req.path_info = path_info.join
+        req.script_name << base
+        res.filename << base
+      end
+
+      def search_index_file(req, res)
+        @config[:DirectoryIndex].each{|index|
+          if file = search_file(req, res, "/"+index)
+            return file
+          end
+        }
+        return nil
+      end
+
+      def search_file(req, res, basename)
+        langs = @options[:AcceptableLanguages]
+        path = res.filename + basename
+        if File.file?(path)
+          return basename
+        elsif langs.size > 0
+          req.accept_language.each{|lang|
+            path_with_lang = path + ".#{lang}"
+            if langs.member?(lang) && File.file?(path_with_lang)
+              return basename + ".#{lang}"
+            end
+          }
+          (langs - req.accept_language).each{|lang|
+            path_with_lang = path + ".#{lang}"
+            if File.file?(path_with_lang)
+              return basename + ".#{lang}"
+            end
+          }
+        end
+        return nil
       end
 
       def call_callback(callback_name, req, res)